High-dimensional Statistics

	High-dimensional Statistics In statistical theory, the field of high-dimensional statistics studies data whose dimension is larger (relative to the number of datapoints) than typically considered in classical multivariate analysis. The area arose owing to the emergence of many modern data sets in which the dimension of the data vectors may be comparable to, or even larger than, the sample size, so that justification for the use of traditional techniques, often based on asymptotic arguments with the dimension held fixed as the sample size increased, was lacking. There are several notions of high-dimensional analysis of statistical methods including: * Non-asymptotic results which apply for finite n, p (number of data points and dimension size, respectively). * Kolmogorov asymptotics which studies the asymptotic behavior where the ratio n/p converges to a specific finite value. Examples Parameter estimation in linear models The most basic statistical model for the relationship between a covariate vector ...
	Statistical Theory The theory of statistics provides a basis for the whole range of techniques, in both study design and data analysis, that are used within applications of statistics. The theory covers approaches to statistical-decision problems and to statistical inference, and the actions and deductions that satisfy the basic principles stated for these different approaches. Within a given approach, statistical theory gives ways of comparing statistical procedures; it can find the best possible procedure within a given context for given statistical problems, or can provide guidance on the choice between alternative procedures. Apart from philosophical considerations about how to make statistical inferences and decisions, much of statistical theory consists of mathematical statistics, and is closely linked to probability theory, to utility theory, and to optimization. Scope Statistical theory provides an underlying rationale and provides a consistent basis for the choice of methodology used i ...
	Eigenvalue In linear algebra, an eigenvector or characteristic vector is a vector that has its direction unchanged (or reversed) by a given linear transformation. More precisely, an eigenvector \mathbf v of a linear transformation T is scaled by a constant factor \lambda when the linear transformation is applied to it: T\mathbf v=\lambda \mathbf v. The corresponding eigenvalue, characteristic value, or characteristic root is the multiplying factor \lambda (possibly a negative or complex number). Geometrically, vectors are multi-dimensional quantities with magnitude and direction, often pictured as arrows. A linear transformation rotates, stretches, or shears the vectors upon which it acts. A linear transformation's eigenvectors are those vectors that are only stretched or shrunk, with neither rotation nor shear. The corresponding eigenvalue is the factor by which an eigenvector is stretched or shrunk. If the eigenvalue is negative, the eigenvector's direction is reversed. Th ...
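The defining relation T\mathbf v=\lambda \mathbf v can be checked numerically. The following is a minimal sketch in plain Python; the helper names `mat_vec` and `is_eigenpair` and the 2×2 example matrix are illustrative assumptions, not part of the entry above.

```python
def mat_vec(T, v):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(t * x for t, x in zip(row, v)) for row in T]


def is_eigenpair(T, v, lam, tol=1e-12):
    """Return True if v is non-zero and T v equals lam * v up to tol."""
    if all(abs(x) <= tol for x in v):
        return False  # the zero vector is never an eigenvector
    Tv = mat_vec(T, v)
    return all(abs(a - lam * b) <= tol for a, b in zip(Tv, v))


# Hypothetical example: this matrix stretches (1, 1) by 3
# and reverses (1, -1), i.e. scales it by -1.
T = [[1, 2],
     [2, 1]]

print(is_eigenpair(T, [1, 1], 3))    # T(1, 1) = (3, 3)
print(is_eigenpair(T, [1, -1], -1))  # T(1, -1) = (-1, 1)
print(is_eigenpair(T, [1, 0], 1))    # T(1, 0) = (1, 2), direction changes
```

The second pair illustrates the negative-eigenvalue case, where the eigenvector's direction is reversed.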
	Band Matrix In mathematics, particularly matrix theory, a band matrix or banded matrix is a sparse matrix whose non-zero entries are confined to a diagonal band, comprising the main diagonal and zero or more diagonals on either side. Band matrix Bandwidth Formally, consider an n×n matrix A = (a_{i,j}). If all matrix elements are zero outside a diagonally bordered band whose range is determined by constants k_1 and k_2: a_{i,j}=0 \quad\text{if}\quad j<i-k_1 \quad\text{or}\quad j>i+k_2; \quad k_1, k_2 \ge 0, then the quantities k_1 and k_2 are called the lower bandwidth and upper bandwidth, respectively. The bandwidth of the matrix is the maximum of k_1 and k_2; in other words, it is the number k such that a_{i,j}=0 if |i-j| > k. Examples A band matrix with k_1 = k_2 = 0 is a diagonal matrix, with bandwidth 0. A band matrix with k_1 = k_2 = 1 is a tridiagonal matrix, with bandwidth 1. For k_1 = k_2 = 2 one has a pentadiagonal matrix and so on. Triangular matrices For k_1 = 0, k_2 = ...
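As a rough illustration of the bandwidth definition, the sketch below computes how far the non-zero entries of a square matrix extend below the main diagonal (k1) and above it (k2). The function name `bandwidths` and the example matrices are assumptions made for illustration.

```python
def bandwidths(A):
    """Return (k1, k2) for a square matrix A given as a list of rows.

    k1 is the largest i - j over non-zero entries (extent below the diagonal),
    k2 is the largest j - i over non-zero entries (extent above the diagonal).
    """
    n = len(A)
    k1 = k2 = 0
    for i in range(n):
        for j in range(n):
            if A[i][j] != 0:
                k1 = max(k1, i - j)
                k2 = max(k2, j - i)
    return k1, k2


diagonal = [[5, 0, 0],
            [0, 6, 0],
            [0, 0, 7]]

tridiagonal = [[2, 1, 0, 0],
               [1, 2, 1, 0],
               [0, 1, 2, 1],
               [0, 0, 1, 2]]

upper_triangular = [[1, 2, 3],
                    [0, 4, 5],
                    [0, 0, 6]]

for name, M in [("diagonal", diagonal),
                ("tridiagonal", tridiagonal),
                ("upper triangular", upper_triangular)]:
    k1, k2 = bandwidths(M)
    print(name, "k1 =", k1, "k2 =", k2, "bandwidth =", max(k1, k2))
```

The diagonal and tridiagonal cases reproduce the bandwidths 0 and 1 stated in the entry; the upper triangular example has k1 = 0.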
	Rank (linear Algebra) In linear algebra, the rank of a matrix A is the dimension of the vector space generated (or spanned) by its columns. This corresponds to the maximal number of linearly independent columns of A. This, in turn, is identical to the dimension of the vector space spanned by its rows. Rank is thus a measure of the "nondegenerateness" of the system of linear equations and linear transformation encoded by A. There are multiple equivalent definitions of rank. A matrix's rank is one of its most fundamental characteristics. The rank is commonly denoted by rank(A) or rk(A); sometimes the parentheses are not written, as in rank A. Alternative notation includes \rho (\Phi). Main definitions In this section, we give some definitions of the rank of a matrix. Many definitions are possible; see Alternative definitions for several of these. The column rank of A is the dimension of the column space of A, while the row rank of A is the dimension of the row space of A. A fundamental resul ...
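One way to compute a rank by hand is Gaussian elimination: the rank equals the number of pivots found. The sketch below uses exact rational arithmetic from the standard library to avoid rounding issues; the function name `matrix_rank` and the example matrix are illustrative assumptions, and elimination is only one of several equivalent approaches.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank of a matrix (list of rows) via exact Gaussian elimination."""
    M = [[Fraction(x) for x in row] for row in rows]
    n_rows = len(M)
    n_cols = len(M[0]) if M else 0
    rank = 0
    for col in range(n_cols):
        # Look for a row at or below position `rank` with a non-zero entry here.
        pivot = next((r for r in range(rank, n_rows) if M[r][col] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        M[rank], M[pivot] = M[pivot], M[rank]
        # Clear this column in every row below the pivot row.
        for r in range(rank + 1, n_rows):
            factor = M[r][col] / M[rank][col]
            M[r] = [a - factor * b for a, b in zip(M[r], M[rank])]
        rank += 1
    return rank


# Hypothetical example: the second row is twice the first.
A = [[1, 2, 3],
     [2, 4, 6],
     [1, 0, 1]]
A_transposed = [list(col) for col in zip(*A)]

print(matrix_rank(A))             # number of independent rows
print(matrix_rank(A_transposed))  # number of independent columns of A
```

Both calls give the same value for this example, in line with the statement that the dimension spanned by the columns is identical to the dimension spanned by the rows.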
	Matrix Norm In the field of mathematics, norms are defined for elements within a vector space. Specifically, when the vector space comprises matrices, such norms are referred to as matrix norms. Matrix norms differ from vector norms in that they must also interact with matrix multiplication. Preliminaries Given a field K of either real or complex numbers (or any complete subset thereof), let K^{m \times n} be the K-vector space of matrices with m rows and n columns and entries in the field K. A matrix norm is a norm on K^{m \times n}. Norms are often expressed with double vertical bars (like so: \|A\|). Thus, the matrix norm is a function \|\cdot\| : K^{m \times n} \to \R^{0+} that must satisfy the following properties: For all scalars \alpha \in K and matrices A, B \in K^{m \times n}, * \|A\| \ge 0 (positive-valued) * \|A\| = 0 \iff A = 0_{m,n} (definite) * \|\alpha A\| = |\alpha| \|A\| (absolutely homogeneous) * \|A + B\| \le \|A\| + ...
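To make the norm properties concrete, the sketch below uses the Frobenius norm (the square root of the sum of squared entries) as one example of a matrix norm. The choice of norm, the helper names, and the example matrices are assumptions made for illustration; the entry above does not single out any particular norm.

```python
import math


def frobenius(A):
    """Frobenius norm: square root of the sum of squared entries."""
    return math.sqrt(sum(x * x for row in A for x in row))


def add(A, B):
    """Entrywise sum of two matrices of the same shape."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def scale(alpha, A):
    """Multiply every entry of A by the scalar alpha."""
    return [[alpha * a for a in row] for row in A]


A = [[3, 0],
     [0, 4]]
B = [[1, 2],
     [3, 4]]
Z = [[0, 0],
     [0, 0]]

print(frobenius(A))                                        # positive-valued
print(frobenius(Z) == 0)                                   # zero matrix has norm 0
print(math.isclose(frobenius(scale(-2, A)), 2 * frobenius(A)))  # homogeneity
print(frobenius(add(A, B)) <= frobenius(A) + frobenius(B))      # A + B bound
```

Checking a handful of matrices does not prove the properties, but it shows what each one asserts.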
	Consistent Estimator In statistics, a consistent estimator or asymptotically consistent estimator is an estimator (a rule for computing estimates of a parameter θ₀) having the property that as the number of data points used increases indefinitely, the resulting sequence of estimates converges in probability to θ₀. This means that the distributions of the estimates become more and more concentrated near the true value of the parameter being estimated, so that the probability of the estimator being arbitrarily close to θ₀ converges to one. In practice one constructs an estimator as a function of an available sample of size n, and then imagines being able to keep collecting data and expanding the sample ad infinitum. In this way one would obtain a sequence of estimates indexed by n, and consistency is a property of what occurs as the sample size "grows to infinity". If the sequence of estimates can be mathematically shown to converge in probability to the true value ...
	Sample Covariance Matrix The sample mean (sample average) or empirical mean (empirical average), and the sample covariance or empirical covariance are statistics computed from a sample of data on one or more random variables. The sample mean is the average value (or mean value) of a sample of numbers taken from a larger population of numbers, where "population" indicates not number of people but the entirety of relevant data, whether collected or not. A sample of 40 companies' sales from the Fortune 500 might be used for convenience instead of looking at the population, all 500 companies' sales. The sample mean is used as an estimator for the population mean, the average value in the entire population, where the estimate is more likely to be close to the population mean if the sample is large and representative. The reliability of the sample mean is estimated using the standard error, which in turn is calculated using the variance of the sample. If the sample is random, the standard error falls with the ...
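The quantities named in this entry can be computed directly. The sketch below assumes the common unbiased convention with an n - 1 denominator for the sample covariance, and takes the standard error of the mean as the square root of the sample variance divided by n; the function names and the small data set are illustrative assumptions.

```python
import math


def sample_mean(xs):
    """Average of the observed values."""
    return sum(xs) / len(xs)


def sample_covariance(xs, ys):
    """Sample covariance of two equally long samples (n - 1 denominator)."""
    n = len(xs)
    mx, my = sample_mean(xs), sample_mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)


def standard_error(xs):
    """Standard error of the sample mean: sqrt(sample variance / n)."""
    variance = sample_covariance(xs, xs)
    return math.sqrt(variance / len(xs))


# Hypothetical data: two variables observed on the same four units.
xs = [2.0, 4.0, 6.0, 8.0]
ys = [1.0, 3.0, 2.0, 6.0]

print("mean of xs:", sample_mean(xs))
print("variance of xs:", sample_covariance(xs, xs))
print("covariance of xs and ys:", sample_covariance(xs, ys))
print("standard error of mean of xs:", standard_error(xs))
```

The sample variance is the covariance of a variable with itself, so one function covers both the diagonal and off-diagonal entries of a sample covariance matrix.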
	Estimation Of Covariance Matrices In statistics, sometimes the covariance matrix of a multivariate random variable is not known but has to be estimated. Estimation of covariance matrices then deals with the question of how to approximate the actual covariance matrix on the basis of a sample from the multivariate distribution. Simple cases, where observations are complete, can be dealt with by using the sample covariance matrix. The sample covariance matrix (SCM) is an unbiased and efficient estimator of the covariance matrix if the space of covariance matrices is viewed as an extrinsic convex cone in R^{p×p}; however, measured using the intrinsic geometry of positive-definite matrices, the SCM is a biased and inefficient estimator. In addition, if the random variable has a normal distribution, the sample co ...
	Lasso (statistics) In statistics and machine learning, lasso (least absolute shrinkage and selection operator; also Lasso, LASSO or L1 regularization) is a regression analysis method that performs both variable selection and regularization in order to enhance the prediction accuracy and interpretability of the resulting statistical model. The lasso method assumes that the coefficients of the linear model are sparse, meaning that few of them are non-zero. It was originally introduced in geophysics, and later by Robert Tibshirani, who coined the term. Lasso was originally formulated for linear regression models. This simple case reveals a substantial amount about the estimator. These include its relationship to ridge regression and best subset selection and the connections between lasso coefficient estimates and so-called soft thresholding. It also reveals that (like standard linear regression) the coefficient estimates do not need to be unique if covariates are collinear ...
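The soft thresholding mentioned in this entry can be sketched in a few lines. The operator shrinks a value towards zero by a threshold and sets it exactly to zero when its magnitude does not exceed the threshold. In the special case of orthonormal covariates, lasso coefficients are obtained by applying this operator to the least-squares coefficients; the exact threshold depends on how the objective is scaled, so the value of `lam` and the coefficient vector below are purely illustrative assumptions.

```python
def soft_threshold(z, lam):
    """Shrink z towards zero by lam; return 0.0 if |z| <= lam."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


# Hypothetical least-squares coefficients (orthonormal design assumed).
ols_coefficients = [3.0, -0.4, 0.9, -2.5, 0.1]
lam = 1.0  # illustrative threshold

lasso_like = [soft_threshold(b, lam) for b in ols_coefficients]
print(lasso_like)
print("non-zero coefficients:", sum(1 for b in lasso_like if b != 0.0))
```

Small coefficients are set exactly to zero while large ones are shrunk, which is how the method performs variable selection and regularization at the same time.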
	Sparse Matrix In numerical analysis and scientific computing, a sparse matrix or sparse array is a matrix in which most of the elements are zero. There is no strict definition regarding the proportion of zero-value elements for a matrix to qualify as sparse but a common criterion is that the number of non-zero elements is roughly equal to the number of rows or columns. By contrast, if most of the elements are non-zero, the matrix is considered dense. The number of zero-valued elements divided by the total number of elements (e.g., m × n for an m × n matrix) is sometimes referred to as the sparsity of the matrix. Conceptually, sparsity corresponds to systems with few pairwise interactions. For example, consider a line of balls connected by springs from one to the next: this is a sparse system, as only adjacent balls are coupled. By contrast, if the same line of balls were to have springs connecting each ball to all other balls, the system would correspond to a dense matrix. ...
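The sparsity ratio and the balls-and-springs picture can be illustrated together. The sketch below builds two hypothetical coupling matrices for a line of ten balls, one where only adjacent balls are coupled and one where every ball is coupled to every other, and computes the fraction of zero entries in each. The function names, the choice of ten balls, and the use of 1 as a placeholder coupling value are assumptions.

```python
def sparsity(A):
    """Number of zero-valued entries divided by the total number of entries."""
    total = sum(len(row) for row in A)
    zeros = sum(1 for row in A for x in row if x == 0)
    return zeros / total


def chain_coupling(n):
    """Coupling matrix where each ball interacts with itself and its neighbours."""
    return [[1 if abs(i - j) <= 1 else 0 for j in range(n)] for i in range(n)]


def full_coupling(n):
    """Coupling matrix where every ball interacts with every other ball."""
    return [[1 for _ in range(n)] for _ in range(n)]


n = 10
print("chain sparsity:", sparsity(chain_coupling(n)))
print("fully connected sparsity:", sparsity(full_coupling(n)))
```

The chain matrix has 3n - 2 non-zero entries, so its sparsity approaches 1 as the number of balls grows, while the fully connected matrix stays at 0.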