Matrix Completion

picture info	Matrix Completion Matrix completion is the task of filling in the missing entries of a partially observed matrix, which is equivalent to performing data imputation in statistics. A wide range of datasets are naturally organized in matrix form. One example is the movie-ratings matrix, as appears in the Netflix problem: Given a ratings matrix in which each entry (i,j) represents the rating of movie j by customer i, if customer i has watched movie j and is otherwise missing, we would like to predict the remaining entries in order to make good recommendations to customers on what to watch next. Another example is the document-term matrix: The frequencies of words used in a collection of documents can be represented as a matrix, where each entry corresponds to the number of times the associated term appears in the indicated document. Without any restrictions on the number of degrees of freedom in the completed matrix this problem is underdetermined since the hidden entries could be assigned arbitrary ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
picture info	Coupon Collector's Problem In probability theory, the coupon collector's problem describes "collect all coupons and win" contests. It asks the following question: If each box of a brand of cereals contains a coupon, and there are ''n'' different types of coupons, what is the probability that more than ''t'' boxes need to be bought to collect all ''n'' coupons? An alternative statement is: Given ''n'' coupons, how many coupons do you expect you need to draw with replacement before having drawn each coupon at least once? The mathematical analysis of the problem reveals that the expected number of trials needed grows as \Theta(n\log(n)). For example, when ''n'' = 50 it takes about 225E(50) = 50(1 + 1/2 + 1/3 + ... + 1/50) = 224.9603, the expected number of trials to collect all 50 coupons. The approximation n\log n+\gamma n+1/2 for this expected number gives in this case 50\log 50+50\gamma+1/2 \approx 195.6011+28.8608+0.5\approx 224.9619. trials on average to collect all 50 coupons. Solution Calcul ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
picture info	Principal Component Analysis Principal component analysis (PCA) is a popular technique for analyzing large datasets containing a high number of dimensions/features per observation, increasing the interpretability of data while preserving the maximum amount of information, and enabling the visualization of multidimensional data. Formally, PCA is a statistical technique for reducing the dimensionality of a dataset. This is accomplished by linearly transforming the data into a new coordinate system where (most of) the variation in the data can be described with fewer dimensions than the initial data. Many studies use the first two principal components in order to plot the data in two dimensions and to visually identify clusters of closely related data points. Principal component analysis has applications in many fields such as population genetics, microbiome studies, and atmospheric science. The principal components of a collection of points in a real coordinate space are a sequence of p unit vectors, where the ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
	Condition Number In numerical analysis, the condition number of a function measures how much the output value of the function can change for a small change in the input argument. This is used to measure how sensitive a function is to changes or errors in the input, and how much error in the output results from an error in the input. Very frequently, one is solving the inverse problem: given f(x) = y, one is solving for ''x,'' and thus the condition number of the (local) inverse must be used. In linear regression the condition number of the moment matrix can be used as a diagnostic for multicollinearity. The condition number is an application of the derivative, and is formally defined as the value of the asymptotic worst-case relative change in output for a relative change in input. The "function" is the solution of a problem and the "arguments" are the data in the problem. The condition number is frequently applied to questions in linear algebra, in which case the derivative is straightforward b ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
	Polylogarithmic Function In mathematics, a polylogarithmic function in is a polynomial in the logarithm of , : a_k (\log n)^k + a_ (\log n)^ + \cdots + a_1(\log n) + a_0. The notation is often used as a shorthand for , analogous to for . In computer science, polylogarithmic functions occur as the order of time or memory used by some algorithms (e.g., "it has polylogarithmic order"). All polylogarithmic functions of are for every exponent (for the meaning of this symbol, see small o notation), that is, a polylogarithmic function grows more slowly than any positive exponent. This observation is the basis for the soft O notation Big ''O'' notation is a mathematical notation that describes the limiting behavior of a function when the argument tends towards a particular value or infinity. Big O is a member of a family of notations invented by Paul Bachmann, Edmund Land ... . References * Mathematical analysis Polynomials Analysis of algorithms {{comp-sci-theory-stub ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
picture info	Banach Space In mathematics, more specifically in functional analysis, a Banach space (pronounced ) is a complete normed vector space. Thus, a Banach space is a vector space with a metric that allows the computation of vector length and distance between vectors and is complete in the sense that a Cauchy sequence of vectors always converges to a well-defined limit that is within the space. Banach spaces are named after the Polish mathematician Stefan Banach, who introduced this concept and studied it systematically in 1920–1922 along with Hans Hahn and Eduard Helly. Maurice René Fréchet was the first to use the term "Banach space" and Banach in turn then coined the term " Fréchet space." Banach spaces originally grew out of the study of function spaces by Hilbert, Fréchet, and Riesz earlier in the century. Banach spaces play a central role in functional analysis. In other areas of analysis, the spaces under study are often Banach spaces. Definition A Banach space is a com ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
	Semidefinite Programming Semidefinite programming (SDP) is a subfield of convex optimization concerned with the optimization of a linear objective function (a user-specified function that the user wants to minimize or maximize) over the intersection of the cone of positive semidefinite matrices with an affine space, i.e., a spectrahedron. Semidefinite programming is a relatively new field of optimization which is of growing interest for several reasons. Many practical problems in operations research and combinatorial optimization can be modeled or approximated as semidefinite programming problems. In automatic control theory, SDPs are used in the context of linear matrix inequalities. SDPs are in fact a special case of cone programming and can be efficiently solved by interior point methods. All linear programs and (convex) quadratic programs can be expressed as SDPs, and via hierarchies of SDPs the solutions of polynomial optimization problems can be approximated. Semidefinite programming has been ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
picture info	Singular Value In mathematics, in particular functional analysis, the singular values, or ''s''-numbers of a compact operator T: X \rightarrow Y acting between Hilbert spaces X and Y, are the square roots of the (necessarily non-negative) eigenvalues of the self-adjoint operator T^T (where T^ denotes the adjoint of T). The singular values are non-negative real numbers, usually listed in decreasing order (''σ''1(''T''), ''σ''2(''T''), …). The largest singular value ''σ''1(''T'') is equal to the operator norm of ''T'' (see Min-max theorem). If ''T'' acts on Euclidean space \Reals ^n, there is a simple geometric interpretation for the singular values: Consider the image by T of the unit sphere; this is an ellipsoid, and the lengths of its semi-axes are the singular values of T (the figure provides an example in \Reals^2). The singular values are the absolute values of the eigenvalues of a normal matrix ''A'', because the spectral theorem can be applied to obtain unitary diagonaliza ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
picture info	Norm (mathematics) In mathematics, a norm is a function from a real or complex vector space to the non-negative real numbers that behaves in certain ways like the distance from the origin: it commutes with scaling, obeys a form of the triangle inequality, and is zero only at the origin. In particular, the Euclidean distance of a vector from the origin is a norm, called the Euclidean norm, or 2-norm, which may also be defined as the square root of the inner product of a vector with itself. A seminorm satisfies the first two properties of a norm, but may be zero for vectors other than the origin. A vector space with a specified norm is called a normed vector space. In a similar manner, a vector space with a seminorm is called a ''seminormed vector space''. The term pseudonorm has been used for several related meanings. It may be a synonym of "seminorm". A pseudonorm may satisfy the same axioms as a norm, with the equality replaced by an inequality "\,\leq\," in the homogeneity axiom. It ca ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
picture info	Convex Function In mathematics, a real-valued function is called convex if the line segment between any two points on the graph of the function lies above the graph between the two points. Equivalently, a function is convex if its epigraph (the set of points on or above the graph of the function) is a convex set. A twice-differentiable function of a single variable is convex if and only if its second derivative is nonnegative on its entire domain. Well-known examples of convex functions of a single variable include the quadratic function x^2 and the exponential function e^x. In simple terms, a convex function refers to a function whose graph is shaped like a cup \cup, while a concave function's graph is shaped like a cap \cap. Convex functions play an important role in many areas of mathematics. They are especially important in the study of optimization problems where they are distinguished by a number of convenient properties. For instance, a strictly convex function on an open set has n ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
	Clustering High-dimensional Data Clustering high-dimensional data is the cluster analysis of data with anywhere from a few dozen to many thousands of dimensions. Such high-dimensional spaces of data are often encountered in areas such as medicine, where DNA microarray technology can produce many measurements at once, and the clustering of text documents, where, if a word-frequency vector is used, the number of dimensions equals the size of the vocabulary. Problems Four problems need to be overcome for clustering in high-dimensional data: * Multiple dimensions are hard to think in, impossible to visualize, and, due to the exponential growth of the number of possible values with each dimension, complete enumeration of all subspaces becomes intractable with increasing dimensionality. This problem is known as the curse of dimensionality. * The concept of distance becomes less precise as the number of dimensions grows, since the distance between any two points in a given dataset converges. The discrimination of the near ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
picture info	NP-Hard In computational complexity theory, NP-hardness ( non-deterministic polynomial-time hardness) is the defining property of a class of problems that are informally "at least as hard as the hardest problems in NP". A simple example of an NP-hard problem is the subset sum problem. A more precise specification is: a problem ''H'' is NP-hard when every problem ''L'' in NP can be reduced in polynomial time to ''H''; that is, assuming a solution for ''H'' takes 1 unit time, ''H''s solution can be used to solve ''L'' in polynomial time. As a consequence, finding a polynomial time algorithm to solve any NP-hard problem would give polynomial time algorithms for all the problems in NP. As it is suspected that P≠NP, it is unlikely that such an algorithm exists. It is suspected that there are no polynomial-time algorithms for NP-hard problems, but that has not been proven. Moreover, the class P, in which all problems can be solved in polynomial time, is contained in the NP class. Def ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]