HOME

TheInfoList



OR:

The general linear model or general multivariate regression model is a compact way of simultaneously writing several
multiple linear regression In statistics, linear regression is a linear approach for modelling the relationship between a scalar response and one or more explanatory variables (also known as dependent and independent variables). The case of one explanatory variable is cal ...
models. In that sense it is not a separate statistical
linear model In statistics, the term linear model is used in different ways according to the context. The most common occurrence is in connection with regression models and the term is often taken as synonymous with linear regression model. However, the term ...
. The various multiple linear regression models may be compactly written as : \mathbf = \mathbf\mathbf + \mathbf, where Y is a
matrix Matrix most commonly refers to: * ''The Matrix'' (franchise), an American media franchise ** ''The Matrix'', a 1999 science-fiction action film ** "The Matrix", a fictional setting, a virtual reality environment, within ''The Matrix'' (franchis ...
with series of multivariate measurements (each column being a set of measurements on one of the
dependent variable Dependent and independent variables are variables in mathematical modeling, statistical modeling and experimental sciences. Dependent variables receive this name because, in an experiment, their values are studied under the supposition or demand ...
s), X is a matrix of observations on
independent variable Dependent and independent variables are variables in mathematical modeling, statistical modeling and experimental sciences. Dependent variables receive this name because, in an experiment, their values are studied under the supposition or demand ...
s that might be a
design matrix In statistics and in particular in regression analysis, a design matrix, also known as model matrix or regressor matrix and often denoted by X, is a matrix of values of explanatory variables of a set of objects. Each row represents an individual ob ...
(each column being a set of observations on one of the independent variables), B is a matrix containing parameters that are usually to be estimated and U is a matrix containing errors (noise). The errors are usually assumed to be uncorrelated across measurements, and follow a
multivariate normal distribution In probability theory and statistics, the multivariate normal distribution, multivariate Gaussian distribution, or joint normal distribution is a generalization of the one-dimensional (univariate) normal distribution to higher dimensions. One d ...
. If the errors do not follow a multivariate normal distribution,
generalized linear model In statistics, a generalized linear model (GLM) is a flexible generalization of ordinary linear regression. The GLM generalizes linear regression by allowing the linear model to be related to the response variable via a ''link function'' and b ...
s may be used to relax assumptions about Y and U. The general linear model incorporates a number of different statistical models:
ANOVA Analysis of variance (ANOVA) is a collection of statistical models and their associated estimation procedures (such as the "variation" among and between groups) used to analyze the differences among means. ANOVA was developed by the statistician ...
,
ANCOVA Analysis of covariance (ANCOVA) is a general linear model which blends ANOVA and regression. ANCOVA evaluates whether the means of a dependent variable (DV) are equal across levels of a categorical independent variable (IV) often called a tre ...
,
MANOVA In statistics, multivariate analysis of variance (MANOVA) is a procedure for comparing multivariate sample means. As a multivariate procedure, it is used when there are two or more dependent variables, and is often followed by significance tests i ...
,
MANCOVA Multivariate analysis of covariance (MANCOVA) is an extension of analysis of covariance (ANCOVA) methods to cover cases where there is more than one dependent variable and where the control of concomitant continuous independent variables – covaria ...
, ordinary
linear regression In statistics, linear regression is a linear approach for modelling the relationship between a scalar response and one or more explanatory variables (also known as dependent and independent variables). The case of one explanatory variable is call ...
, ''t''-test and ''F''-test. The general linear model is a generalization of multiple linear regression to the case of more than one dependent variable. If Y, B, and U were
column vector In linear algebra, a column vector with m elements is an m \times 1 matrix consisting of a single column of m entries, for example, \boldsymbol = \begin x_1 \\ x_2 \\ \vdots \\ x_m \end. Similarly, a row vector is a 1 \times n matrix for some n, c ...
s, the matrix equation above would represent multiple linear regression. Hypothesis tests with the general linear model can be made in two ways:
multivariate Multivariate may refer to: In mathematics * Multivariable calculus * Multivariate function * Multivariate polynomial In computing * Multivariate cryptography * Multivariate division algorithm * Multivariate interpolation * Multivariate optical c ...
or as several independent
univariate In mathematics, a univariate object is an expression, equation, function or polynomial involving only one variable. Objects involving more than one variable are multivariate. In some cases the distinction between the univariate and multivariate cas ...
tests. In multivariate tests the columns of Y are tested together, whereas in univariate tests the columns of Y are tested independently, i.e., as multiple univariate tests with the same design matrix.


Comparison to multiple linear regression

Multiple linear regression is a generalization of
simple linear regression In statistics, simple linear regression is a linear regression model with a single explanatory variable. That is, it concerns two-dimensional sample points with one independent variable and one dependent variable (conventionally, the ''x'' and ...
to the case of more than one independent variable, and a
special case In logic, especially as applied in mathematics, concept is a special case or specialization of concept precisely if every instance of is also an instance of but not vice versa, or equivalently, if is a generalization of . A limiting case is ...
of general linear models, restricted to one dependent variable. The basic model for multiple linear regression is : Y_i = \beta_0 + \beta_1 X_ + \beta_2 X_ + \ldots + \beta_p X_ + \epsilon_i or more compactly Y_i = \beta_0 + \sum \limits_^ + \epsilon_i for each observation ''i'' = 1, ... , ''n''. In the formula above we consider ''n'' observations of one dependent variable and ''p'' independent variables. Thus, ''Y''''i'' is the ''i''th observation of the dependent variable, ''X''''ij'' is ''i''th observation of the ''j''th independent variable, ''j'' = 1, 2, ..., ''p''. The values ''β''''j'' represent parameters to be estimated, and ''ε''''i'' is the ''i''th independent identically distributed normal error. In the more general multivariate linear regression, there is one equation of the above form for each of ''m'' > 1 dependent variables that share the same set of explanatory variables and hence are estimated simultaneously with each other: : Y_ = \beta_ + \beta_ X_ + \beta_X_ + \ldots + \beta_ X_ + \epsilon_ or more compactly Y_ = \beta_ + \sum \limits_^ + \epsilon_ for all observations indexed as ''i'' = 1, ... , ''n'' and for all dependent variables indexed as ''j = 1, ... , ''m''. Note that, since each dependent variable has its own set of regression parameters to be fitted, from a computational point of view the general multivariate regression is simply a sequence of standard multiple linear regressions using the same explanatory variables.


Comparison to generalized linear model

The general linear model and the generalized linear model (GLM) are two commonly used families of
statistical methods Statistics (from German: ''Statistik'', "description of a state, a country") is the discipline that concerns the collection, organization, analysis, interpretation, and presentation of data. In applying statistics to a scientific, industria ...
to relate some number of continuous and/or categorical predictors to a single outcome variable. The main difference between the two approaches is that the general linear model strictly assumes that the residuals will follow a conditionally
normal distribution In statistics, a normal distribution or Gaussian distribution is a type of continuous probability distribution for a real-valued random variable. The general form of its probability density function is : f(x) = \frac e^ The parameter \mu ...
,Cohen, J., Cohen, P., West, S. G., & Aiken, L. S. (2003). Applied multiple regression/correlation analysis for the behavioral sciences. while the GLM loosens this assumption and allows for a variety of other distributions from the
exponential family In probability and statistics, an exponential family is a parametric set of probability distributions of a certain form, specified below. This special form is chosen for mathematical convenience, including the enabling of the user to calculate ...
for the residuals. Of note, the general linear model is a special case of the GLM in which the distribution of the residuals follow a conditionally normal distribution. The distribution of the residuals largely depends on the type and distribution of the outcome variable; different types of outcome variables lead to the variety of models within the GLM family. Commonly used models in the GLM family include binary logistic regression for binary or dichotomous outcomes,
Poisson regression In statistics, Poisson regression is a generalized linear model form of regression analysis used to model count data and contingency tables. Poisson regression assumes the response variable ''Y'' has a Poisson distribution, and assumes the logari ...
for count outcomes, and
linear regression In statistics, linear regression is a linear approach for modelling the relationship between a scalar response and one or more explanatory variables (also known as dependent and independent variables). The case of one explanatory variable is call ...
for continuous, normally distributed outcomes. This means that GLM may be spoken of as a general family of statistical models or as specific models for specific outcome types.


Applications

An application of the general linear model appears in the analysis of multiple
brain scan Neuroimaging is the use of quantitative (computational) techniques to study the structure and function of the central nervous system, developed as an objective way of scientifically studying the healthy human brain in a non-invasive manner. Incr ...
s in scientific experiments where contains data from brain scanners, contains experimental design variables and confounds. It is usually tested in a univariate way (usually referred to a ''mass-univariate'' in this setting) and is often referred to as statistical parametric mapping.


See also

*
Bayesian multivariate linear regression In statistics, Bayesian multivariate linear regression is a Bayesian approach to multivariate linear regression, i.e. linear regression where the predicted outcome is a vector of correlated random variables rather than a single scalar random v ...
*
F-test An ''F''-test is any statistical test in which the test statistic has an ''F''-distribution under the null hypothesis. It is most often used when comparing statistical models that have been fitted to a data set, in order to identify the model th ...
*
t-test A ''t''-test is any statistical hypothesis testing, statistical hypothesis test in which the test statistic follows a Student's t-distribution, Student's ''t''-distribution under the null hypothesis. It is most commonly applied when the test stati ...


Notes


References

* * * {{statistics Regression models