Principal Component Regression

In statistics, principal component regression (PCR) is a regression analysis technique that is based on principal component analysis (PCA). More specifically, PCR is used for estimating the unknown regression coefficients in a standard linear regression model. In PCR, instead of regressing the dependent variable on the explanatory variables directly, the principal components of the explanatory variables are used as regressors. One typically uses only a subset of all the principal components for regression, making PCR a kind of regularized procedure and also a type of shrinkage estimator. Often the principal components with higher variances (the ones based on eigenvectors corresponding to the higher eigenvalues of the sample variance-covariance matrix of the explanatory variables) are selected as regressors. However, for the purpose of predicting the outcome, the principal components with low variances may also be important, in some cases even more important. One major use o ...
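The procedure described above can be sketched in a few lines of NumPy. This is a minimal illustration rather than a reference implementation: the function name `pcr_fit`, the synthetic data, and the choice to select components by highest variance are all assumptions made for the example.

```python
import numpy as np

def pcr_fit(X, y, k):
    """Principal component regression keeping the k highest-variance components.

    Hypothetical helper for illustration; returns coefficients on the
    original explanatory variables plus an intercept.
    """
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean                      # center the explanatory variables
    yc = y - y_mean                      # center the outcome
    # Rows of Vt are the principal directions, ordered by decreasing singular value
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V_k = Vt[:k].T                       # p x k matrix of the leading directions
    Z = Xc @ V_k                         # n x k matrix of principal component scores
    gamma, *_ = np.linalg.lstsq(Z, yc, rcond=None)  # regress the outcome on the scores
    beta = V_k @ gamma                   # map back to the original variables
    intercept = y_mean - x_mean @ beta
    return beta, intercept

# Synthetic data (assumed for the example)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
beta_true = np.array([1.0, -2.0, 0.5, 0.0, 3.0])
y = X @ beta_true + 0.1 * rng.normal(size=200)

beta_full, b0_full = pcr_fit(X, y, k=5)  # all components: coincides with ordinary least squares
beta_k3, b0_k3 = pcr_fit(X, y, k=3)      # a subset: a regularized, shrunken estimate
```

With all components retained the fit reproduces ordinary least squares; dropping components projects the coefficient vector onto the retained directions, which is where the shrinkage comes from.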
Statistics
Statistics (from German: ''Statistik'', "description of a state, a country") is the discipline that concerns the collection, organization, analysis, interpretation, and presentation of data. In applying statistics to a scientific, industrial, or social problem, it is conventional to begin with a statistical population or a statistical model to be studied. Populations can be diverse groups of people or objects such as "all people living in a country" or "every atom composing a crystal". Statistics deals with every aspect of data, including the planning of data collection in terms of the design of surveys and experiments (Dodge, Y. (2006) ''The Oxford Dictionary of Statistical Terms'', Oxford University Press). When census data cannot be collected, statisticians collect data by developing specific experiment designs and survey samples. Representative sampling as ...
High-dimensional Statistics
In statistical theory, the field of high-dimensional statistics studies data whose dimension is larger than typically considered in classical multivariate analysis. The area arose owing to the emergence of many modern data sets in which the dimension of the data vectors may be comparable to, or even larger than, the sample size, so that justification for the use of traditional techniques, often based on asymptotic arguments with the dimension held fixed as the sample size increased, was lacking. Examples: Parameter estimation in linear models. The most basic statistical model for the relationship between a covariate vector x \in \mathbb{R}^p and a response variable y \in \mathbb{R} is the linear model : y = x^\top \beta + \epsilon, where \beta \in \mathbb{R}^p is an unknown parameter vector, and \epsilon is random noise with mean zero and variance \sigma^2. Given independent responses Y_1,\ldots,Y_n, with corresponding covariates x_1,\ldots,x_n, from this model, we can form the response ...
Orthonormality
In linear algebra, two vectors in an inner product space are orthonormal if they are orthogonal (or perpendicular along a line) unit vectors. A set of vectors forms an orthonormal set if all vectors in the set are mutually orthogonal and all of unit length. An orthonormal set which forms a basis is called an orthonormal basis. Intuitive overview: The construction of orthogonality of vectors is motivated by a desire to extend the intuitive notion of perpendicular vectors to higher-dimensional spaces. In the Cartesian plane, two vectors are said to be ''perpendicular'' if the angle between them is 90° (i.e. if they form a right angle). This definition can be formalized in Cartesian space by defining the dot product and specifying that two vectors in the plane are orthogonal if their dot product is zero. Similarly, the construction of the norm of a vector is motivated by a desire to extend the intuitive notion of the length of a vector to higher-dimensional spaces. In Cartesian s ...
Column Vector
In linear algebra, a column vector with m elements is an m \times 1 matrix consisting of a single column of m entries, for example, \boldsymbol{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_m \end{bmatrix}. Similarly, a row vector is a 1 \times n matrix for some n, consisting of a single row of n entries, \boldsymbol{a} = \begin{bmatrix} a_1 & a_2 & \dots & a_n \end{bmatrix}. (Throughout this article, boldface is used for both row and column vectors.) The transpose (indicated by T) of any row vector is a column vector, and the transpose of any column vector is a row vector: \begin{bmatrix} x_1 \; x_2 \; \dots \; x_m \end{bmatrix}^{\mathsf{T}} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_m \end{bmatrix} and \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_m \end{bmatrix}^{\mathsf{T}} = \begin{bmatrix} x_1 \; x_2 \; \dots \; x_m \end{bmatrix}. The set of all row vectors with ''n'' entries in a given field (such as the real numbers) forms an ''n''-dimensional vector space; similarly, the set of all column vectors with ''m'' entries forms an ''m''-dimensional vector space. The space of row vectors with ''n'' entries can b ...
Singular Value Decomposition
In linear algebra, the singular value decomposition (SVD) is a factorization of a real or complex matrix. It generalizes the eigendecomposition of a square normal matrix with an orthonormal eigenbasis to any m \times n matrix. It is related to the polar decomposition. Specifically, the singular value decomposition of an m \times n complex matrix \mathbf{M} is a factorization of the form \mathbf{M} = \mathbf{U} \mathbf{\Sigma} \mathbf{V}^*, where \mathbf{U} is an m \times m complex unitary matrix, \mathbf{\Sigma} is an m \times n rectangular diagonal matrix with non-negative real numbers on the diagonal, \mathbf{V} is an n \times n complex unitary matrix, and \mathbf{V}^* is the conjugate transpose of \mathbf{V}. Such decomposition always exists for any complex matrix. If \mathbf{M} is real, then \mathbf{U} and \mathbf{V} can be guaranteed to be real orthogonal matrices; in such contexts, the SVD is often denoted \mathbf{U} \mathbf{\Sigma} \mathbf{V}^\mathsf{T}. The diagonal entries \sigma_i = \Sigma_{ii} of \mathbf{\Sigma} are uniquely determined by \mathbf{M} and are known as the singular values of \mathbf{M}. The n ...
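As a small numerical check of the factorization just described, the sketch below decomposes a real 2 x 3 matrix with NumPy and rebuilds it from its factors. The example matrix and the variable names are arbitrary choices for illustration.

```python
import numpy as np

# An arbitrary real 2 x 3 matrix (assumed for the example)
M = np.array([[3.0, 1.0, 1.0],
              [-1.0, 3.0, 1.0]])

# full_matrices=True returns U (2 x 2) and Vh (3 x 3); s holds the singular values
U, s, Vh = np.linalg.svd(M, full_matrices=True)

# Build the rectangular diagonal factor with the singular values on its diagonal
Sigma = np.zeros(M.shape)
Sigma[:len(s), :len(s)] = np.diag(s)

# For a real matrix, Vh is the transpose of V, so this is U Sigma V^T
M_rebuilt = U @ Sigma @ Vh

print("singular values:", s)
print("max reconstruction error:", np.abs(M - M_rebuilt).max())
```

NumPy returns the singular values in descending order and returns the transposed (conjugate-transposed, for complex input) right factor directly, which is why no extra transpose appears in the product.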
Bias Of An Estimator
In statistics, the bias of an estimator (or bias function) is the difference between this estimator's expected value and the true value of the parameter being estimated. An estimator or decision rule with zero bias is called ''unbiased''. In statistics, "bias" is a property of an estimator. Bias is a distinct concept from consistency: consistent estimators converge in probability to the true value of the parameter, but may be biased or unbiased; see bias versus consistency for more. All else being equal, an unbiased estimator is preferable to a biased estimator, although in practice, biased estimators (with generally small bias) are frequently used. When a biased estimator is used, bounds of the bias are calculated. A biased estimator may be used for various reasons: because an unbiased estimator does not exist without further assumptions about a population; because an estimator is difficult to compute (as in unbiased estimation of standard deviation); because a biased estimato ...
Rank (linear Algebra)
In linear algebra, the rank of a matrix A is the dimension of the vector space generated (or spanned) by its columns. This corresponds to the maximal number of linearly independent columns of A. This, in turn, is identical to the dimension of the vector space spanned by its rows. Rank is thus a measure of the "nondegenerateness" of the system of linear equations and linear transformation encoded by A. There are multiple equivalent definitions of rank. A matrix's rank is one of its most fundamental characteristics. The rank is commonly denoted by \operatorname{rank}(A) or \operatorname{rk}(A); sometimes the parentheses are not written, as in \operatorname{rank} A. Alternative notation includes \rho(\Phi). Main definitions: In this section, we give some definitions of the rank of a matrix. Many definitions are possible; see Alternative definitions for several of these. The column rank of A is the dimension of the column space of A, while the row rank of A is the dimension of the row space of A. A fundamental result in ...
Estimator
In statistics, an estimator is a rule for calculating an estimate of a given quantity based on observed data: thus the rule (the estimator), the quantity of interest (the estimand) and its result (the estimate) are distinguished. For example, the sample mean is a commonly used estimator of the population mean. There are point and interval estimators. The point estimators yield single-valued results. This is in contrast to an interval estimator, where the result would be a range of plausible values. "Single value" does not necessarily mean "single number", but includes vector valued or function valued estimators. ''Estimation theory'' is concerned with the properties of estimators; that is, with defining properties that can be used to compare different estimators (different rules for creating estimates) for the same quantity, based on the same data. Such properties can be used to determine the best rules to use under given circumstances. However, in robust statistics, statistica ...
Gauss–Markov Theorem
In statistics, the Gauss–Markov theorem (or simply Gauss theorem for some authors) states that the ordinary least squares (OLS) estimator has the lowest sampling variance within the class of linear unbiased estimators, if the errors in the linear regression model are uncorrelated, have equal variances and expectation value of zero. The errors do not need to be normal, nor do they need to be independent and identically distributed (only uncorrelated with mean zero and homoscedastic with finite variance). The requirement that the estimator be unbiased cannot be dropped, since biased estimators exist with lower variance. See, for example, the James–Stein estimator (which also drops linearity), ridge regression, or simply any degenerate estimator. The theorem was named after Carl Friedrich Gauss and Andrey Markov, although Gauss' work significantly predates Markov's. But while Gauss derived the result under the assumption of independence and normality, Markov reduced the assu ...
Centering Matrix
In mathematics and multivariate statistics, the centering matrix (John I. Marden, ''Analyzing and Modeling Rank Data'', Chapman & Hall, 1995, page 59) is a symmetric and idempotent matrix, which when multiplied with a vector has the same effect as subtracting the mean of the components of the vector from every component of that vector. Definition: The centering matrix of size ''n'' is defined as the ''n''-by-''n'' matrix :C_n = I_n - \tfrac{1}{n} J_n where I_n is the identity matrix of size ''n'' and J_n is an ''n''-by-''n'' matrix of all 1's. For example :C_1 = \begin{bmatrix} 0 \end{bmatrix}, :C_2 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} - \frac{1}{2} \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} = \begin{bmatrix} \frac{1}{2} & -\frac{1}{2} \\ -\frac{1}{2} & \frac{1}{2} \end{bmatrix}, :C_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} - \frac{1}{3} \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix} = \begin{bmatrix} \frac{2}{3} & -\frac{1}{3} & -\frac{1}{3} \\ -\frac{1}{3} & \frac{2}{3} & -\frac{1}{3} \\ -\frac{1}{3} & -\frac{1}{3} & \frac{2}{3} \end{bmatrix} ...
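The definition above is easy to verify numerically. The following sketch builds C_n with NumPy and applies it to a short vector; the helper name `centering_matrix` and the sample vector are assumptions made for the illustration.

```python
import numpy as np

def centering_matrix(n):
    """Return C_n = I_n - (1/n) J_n, with J_n the n-by-n matrix of all ones."""
    return np.eye(n) - np.ones((n, n)) / n

C3 = centering_matrix(3)

# Multiplying by C3 subtracts the mean of the components from every component
v = np.array([1.0, 2.0, 6.0])   # mean is 3
centered = C3 @ v

print(C3)
print(centered)
```

In practice one would usually compute `v - v.mean()` directly, since forming the full n-by-n matrix costs O(n^2) memory; the explicit matrix is mainly useful in derivations.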
Sample (statistics)
In statistics, quality assurance, and survey methodology, sampling is the selection of a subset (a statistical sample) of individuals from within a statistical population to estimate characteristics of the whole population. Statisticians attempt to collect samples that are representative of the population in question. Sampling has lower costs and faster data collection than measuring the entire population and can provide insights in cases where it is infeasible to measure an entire population. Each observation measures one or more properties (such as weight, location, colour or mass) of independent objects or individuals. In survey sampling, weights can be applied to the data to adjust for the sample design, particularly in stratified sampling. Results from probability theory and statistical theory are employed to guide the practice. In business and medical research, sampling is widely used for gathering information about a population. Acceptance sampling is used to determine ...
Transformation Matrix
In linear algebra, linear transformations can be represented by matrices. If T is a linear transformation mapping \mathbb{R}^n to \mathbb{R}^m and \mathbf x is a column vector with n entries, then T( \mathbf x ) = A \mathbf x for some m \times n matrix A, called the transformation matrix of T. Note that A has m rows and n columns, whereas the transformation T is from \mathbb{R}^n to \mathbb{R}^m. There are alternative expressions of transformation matrices involving row vectors that are preferred by some authors. Uses: Matrices allow arbitrary linear transformations to be displayed in a consistent format, suitable for computation. This also allows transformations to be composed easily (by multiplying their matrices). Linear transformations are not the only ones that can be represented by matrices. Some transformations that are non-linear on an n-dimensional Euclidean space \mathbb{R}^n can be represented as linear transformations on the (n+1)-dimensional space \mathbb{R}^{n+1}. These include both aff ...