Latent Semantic Indexing

	Latent Semantic Indexing Latent semantic analysis (LSA) is a technique in natural language processing, in particular distributional semantics, of analyzing relationships between a set of documents and the terms they contain by producing a set of concepts related to the documents and terms. LSA assumes that words that are close in meaning will occur in similar pieces of text (the distributional hypothesis). A matrix containing word counts per document (rows represent unique words and columns represent each document) is constructed from a large piece of text and a mathematical technique called singular value decomposition (SVD) is used to reduce the number of rows while preserving the similarity structure among columns. Documents are then compared by cosine similarity between any two columns. Values close to 1 represent very similar documents while values close to 0 represent very dissimilar documents. An information retrieval technique using latent semantic structure was patented in 1988US Patent 4,839, ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
picture info	Natural Language Processing Natural language processing (NLP) is an interdisciplinary subfield of linguistics, computer science, and artificial intelligence concerned with the interactions between computers and human language, in particular how to program computers to process and analyze large amounts of natural language data. The goal is a computer capable of "understanding" the contents of documents, including the contextual nuances of the language within them. The technology can then accurately extract information and insights contained in the documents as well as categorize and organize the documents themselves. Challenges in natural language processing frequently involve speech recognition, natural-language understanding, and natural-language generation. History Natural language processing has its roots in the 1950s. Already in 1950, Alan Turing published an article titled "Computing Machinery and Intelligence" which proposed what is now called the Turing test as a criterion of intelligence, t ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
picture info	Vector Space Model Vector space model or term vector model is an algebraic model for representing text documents (and any objects, in general) as vectors of identifiers (such as index terms). It is used in information filtering, information retrieval, indexing and relevancy rankings. Its first use was in the SMART Information Retrieval System. Definitions Documents and queries are represented as vectors. :d_j = ( w_ ,w_ , \dotsc ,w_ ) :q = ( w_ ,w_ , \dotsc ,w_ ) Each dimension corresponds to a separate term. If a term occurs in the document, its value in the vector is non-zero. Several different ways of computing these values, also known as (term) weights, have been developed. One of the best known schemes is tf-idf weighting (see the example below). The definition of ''term'' depends on the application. Typically terms are single words, keywords, or longer phrases. If words are chosen to be the terms, the dimensionality of the vector is the number of words in the vocabulary (the number of dist ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
	Frobenius Norm In mathematics, a matrix norm is a vector norm in a vector space whose elements (vectors) are matrices (of given dimensions). Preliminaries Given a field K of either real or complex numbers, let K^ be the -vector space of matrices with m rows and n columns and entries in the field K. A matrix norm is a norm on K^. This article will always write such norms with double vertical bars (like so: \, A\, ). Thus, the matrix norm is a function \, \cdot\, : K^ \to \R that must satisfy the following properties: For all scalars \alpha \in K and matrices A, B \in K^, \, A\, \ge 0 (''positive-valued'') \, A\, = 0 \iff A=0_ (''definite'') \left\, \alpha A\right\, =\left, \alpha\ \left\, A\right\, (''absolutely homogeneous'') \, A+B\, \le \, A\, +\, B\, (''sub-additive'' or satisfying the ''triangle inequality'') The only feature distinguishing matrices from rearranged vectors is multiplication. Matrix norms are particularly useful if they are also sub-multiplicative: \left\, ... [...More Info...] [...Related Items...] OR:* [Wikipedia] [Google] [Baidu]
picture info	Eigenvector In linear algebra, an eigenvector () or characteristic vector of a linear transformation is a nonzero vector that changes at most by a scalar factor when that linear transformation is applied to it. The corresponding eigenvalue, often denoted by \lambda, is the factor by which the eigenvector is scaled. Geometrically, an eigenvector, corresponding to a real nonzero eigenvalue, points in a direction in which it is stretched by the transformation and the eigenvalue is the factor by which it is stretched. If the eigenvalue is negative, the direction is reversed. Loosely speaking, in a multidimensional vector space, the eigenvector is not rotated. Formal definition If is a linear transformation from a vector space over a field into itself and is a nonzero vector in , then is an eigenvector of if is a scalar multiple of . This can be written as T(\mathbf) = \lambda \mathbf, where is a scalar in , known as the eigenvalue, characteristic value, or characteristic root ass ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
	Diagonal Matrix In linear algebra, a diagonal matrix is a matrix in which the entries outside the main diagonal are all zero; the term usually refers to square matrices. Elements of the main diagonal can either be zero or nonzero. An example of a 2×2 diagonal matrix is \left begin 3 & 0 \\ 0 & 2 \end\right/math>, while an example of a 3×3 diagonal matrix is \left begin 6 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end\right/math>. An identity matrix of any size, or any multiple of it (a scalar matrix), is a diagonal matrix. A diagonal matrix is sometimes called a scaling matrix, since matrix multiplication with it results in changing scale (size). Its determinant is the product of its diagonal values. Definition As stated above, a diagonal matrix is a matrix in which all off-diagonal entries are zero. That is, the matrix with ''n'' columns and ''n'' rows is diagonal if \forall i,j \in \, i \ne j \implies d_ = 0. However, the main diagonal entries are unrestricted. The term ''diagonal matrix'' may s ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
	Orthogonal Matrix In linear algebra, an orthogonal matrix, or orthonormal matrix, is a real square matrix whose columns and rows are orthonormal vectors. One way to express this is Q^\mathrm Q = Q Q^\mathrm = I, where is the transpose of and is the identity matrix. This leads to the equivalent characterization: a matrix is orthogonal if its transpose is equal to its inverse: Q^\mathrm=Q^, where is the inverse of . An orthogonal matrix is necessarily invertible (with inverse ), unitary (), where is the Hermitian adjoint (conjugate transpose) of , and therefore normal () over the real numbers. The determinant of any orthogonal matrix is either +1 or −1. As a linear transformation, an orthogonal matrix preserves the inner product of vectors, and therefore acts as an isometry of Euclidean space, such as a rotation, reflection or rotoreflection. In other words, it is a unitary transformation. The set of orthogonal matrices, under multiplication, forms the group , known as the orthogonal gr ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
picture info	Matrix Product In mathematics, particularly in linear algebra, matrix multiplication is a binary operation that produces a matrix from two matrices. For matrix multiplication, the number of columns in the first matrix must be equal to the number of rows in the second matrix. The resulting matrix, known as the matrix product, has the number of rows of the first and the number of columns of the second matrix. The product of matrices and is denoted as . Matrix multiplication was first described by the French mathematician Jacques Philippe Marie Binet in 1812, to represent the composition of linear maps that are represented by matrices. Matrix multiplication is thus a basic tool of linear algebra, and as such has numerous applications in many areas of mathematics, as well as in applied mathematics, statistics, physics, economics, and engineering. Computing matrix products is a central operation in all computational applications of linear algebra. Notation This article will use the following notati ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
picture info	Correlation In statistics, correlation or dependence is any statistical relationship, whether causal or not, between two random variables or bivariate data. Although in the broadest sense, "correlation" may indicate any type of association, in statistics it usually refers to the degree to which a pair of variables are ''linearly'' related. Familiar examples of dependent phenomena include the correlation between the height of parents and their offspring, and the correlation between the price of a good and the quantity the consumers are willing to purchase, as it is depicted in the so-called demand curve. Correlations are useful because they can indicate a predictive relationship that can be exploited in practice. For example, an electrical utility may produce less power on a mild day based on the correlation between electricity demand and weather. In this example, there is a causal relationship, because extreme weather causes people to use more electricity for heating or cooling. However ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
picture info	Dot Product In mathematics, the dot product or scalar productThe term ''scalar product'' means literally "product with a scalar as a result". It is also used sometimes for other symmetric bilinear forms, for example in a pseudo-Euclidean space. is an algebraic operation that takes two equal-length sequences of numbers (usually coordinate vectors), and returns a single number. In Euclidean geometry, the dot product of the Cartesian coordinates of two vectors is widely used. It is often called the inner product (or rarely projection product) of Euclidean space, even though it is not the only inner product that can be defined on Euclidean space (see Inner product space for more). Algebraically, the dot product is the sum of the products of the corresponding entries of the two sequences of numbers. Geometrically, it is the product of the Euclidean magnitudes of the two vectors and the cosine of the angle between them. These definitions are equivalent when using Cartesian coordinates. In mo ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
	Polysemy Polysemy ( or ; ) is the capacity for a sign (e.g. a symbol, a morpheme, a word, or a phrase) to have multiple related meanings. For example, a word can have several word senses. Polysemy is distinct from ''monosemy'', where a word has a single meaning. Polysemy is distinct from homonymy—or homophony—which is an accidental similarity between two or more words (such as ''bear'' the animal, and the verb ''bear''); whereas homonymy is a mere linguistic coincidence, polysemy is not. In discerning whether a given set of meanings represent polysemy or homonymy, it is often necessary to look at the history of the word to see whether the two meanings are historically related. Dictionary writers often list polysemes (words or phrases with different, but related, senses) in the same entry (that is, under the same headword) and enter homonyms as separate headwords (usually with a numbering convention such as ''¹bear'' and ''²bear''). Polysemes A polyseme is a word or phrase wit ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
picture info	Synonymy A synonym is a word, morpheme, or phrase that means exactly or nearly the same as another word, morpheme, or phrase in a given language. For example, in the English language, the words ''begin'', ''start'', ''commence'', and ''initiate'' are all synonyms of one another: they are ''synonymous''. The standard test for synonymy is substitution: one form can be replaced by another in a sentence without changing its meaning. Words are considered synonymous in only one particular sense: for example, ''long'' and ''extended'' in the context ''long time'' or ''extended time'' are synonymous, but ''long'' cannot be used in the phrase ''extended family''. Synonyms with exactly the same meaning share a seme or denotational sememe, whereas those with inexactly similar meanings share a broader denotational or connotational sememe and thus overlap within a semantic field. The former are sometimes called cognitive synonyms and the latter, near-synonyms, plesionyms or poecilonyms. Lexicography ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]