Mixed Model

A mixed model, mixed-effects model or mixed error-component model is a statistical model containing both fixed effects and random effects. These models are useful in a wide variety of disciplines in the physical, biological and social sciences. They are particularly useful in settings where repeated measurements are made on the same statistical units (longitudinal studies), or where measurements are made on clusters of related statistical units. Because of their advantage in dealing with missing values, mixed effects models are often preferred over more traditional approaches such as repeated measures analysis of variance (ANOVA). This page will discuss mainly linear mixed-effects models (LMEM) rather than generalized linear mixed models or nonlinear mixed-effects models.


History and current status

Ronald Fisher introduced random effects models to study the correlations of trait values between relatives. In the 1950s, Charles Roy Henderson provided best linear unbiased estimates of fixed effects and best linear unbiased predictions of random effects. Subsequently, mixed modeling has become a major area of statistical research, including work on computation of maximum likelihood estimates, non-linear mixed effects models, missing data in mixed effects models, and Bayesian estimation of mixed effects models. Mixed models are applied in many disciplines where multiple correlated measurements are made on each unit of interest. They are prominently used in research involving human and animal subjects in fields ranging from genetics to marketing, and have also been used in baseball and industrial statistics.


Definition

In matrix notation a linear mixed model can be represented as

: \boldsymbol{y} = X \boldsymbol{\beta} + Z \boldsymbol{u} + \boldsymbol{\epsilon}

where
* \boldsymbol{y} is a known vector of observations, with mean E(\boldsymbol{y}) = X \boldsymbol{\beta};
* \boldsymbol{\beta} is an unknown vector of fixed effects;
* \boldsymbol{u} is an unknown vector of random effects, with mean E(\boldsymbol{u}) = \boldsymbol{0} and variance–covariance matrix \operatorname{Var}(\boldsymbol{u}) = G;
* \boldsymbol{\epsilon} is an unknown vector of random errors, with mean E(\boldsymbol{\epsilon}) = \boldsymbol{0} and variance \operatorname{Var}(\boldsymbol{\epsilon}) = R;
* X and Z are known design matrices relating the observations \boldsymbol{y} to \boldsymbol{\beta} and \boldsymbol{u}, respectively.
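The matrix form above can be made concrete by simulating from it. The sketch below assumes the simplest common special case: one random intercept per cluster, with G = σ_u² I and R = σ_e² I (the variance components σ_u, σ_e and all dimensions are illustrative choices, not part of the definition).

```python
import numpy as np

rng = np.random.default_rng(0)

n_groups, per_group = 5, 10            # 5 clusters, 10 observations each
n = n_groups * per_group

# Fixed-effects design X: intercept plus one covariate
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
beta = np.array([1.0, 2.0])            # true fixed effects (illustrative)

# Random-effects design Z: indicator of cluster membership
group = np.repeat(np.arange(n_groups), per_group)
Z = np.zeros((n, n_groups))
Z[np.arange(n), group] = 1.0

sigma_u, sigma_e = 0.5, 1.0            # assumed variance components
u = rng.normal(scale=sigma_u, size=n_groups)  # u ~ N(0, G), G = sigma_u^2 I
e = rng.normal(scale=sigma_e, size=n)         # e ~ N(0, R), R = sigma_e^2 I

# The linear mixed model in matrix form: y = X beta + Z u + e
y = X @ beta + Z @ u + e
```

Each row of Z contains a single 1 selecting that observation's cluster, so Z u adds the same random intercept to every observation in a cluster, which is what induces the within-cluster correlation mixed models are designed to capture.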


Estimation

The joint density of \boldsymbol{y} and \boldsymbol{u} can be written as: f(\boldsymbol{y}, \boldsymbol{u}) = f(\boldsymbol{y} \mid \boldsymbol{u}) \, f(\boldsymbol{u}). Assuming normality, \boldsymbol{u} \sim \mathcal{N}(\boldsymbol{0}, G), \boldsymbol{\epsilon} \sim \mathcal{N}(\boldsymbol{0}, R) and \operatorname{Cov}(\boldsymbol{u}, \boldsymbol{\epsilon}) = \boldsymbol{0}, and maximizing the joint density over \boldsymbol{\beta} and \boldsymbol{u}, gives Henderson's "mixed model equations" (MME) for linear mixed models:

: \begin{pmatrix} X'R^{-1}X & X'R^{-1}Z \\ Z'R^{-1}X & Z'R^{-1}Z + G^{-1} \end{pmatrix} \begin{pmatrix} \hat{\boldsymbol{\beta}} \\ \hat{\boldsymbol{u}} \end{pmatrix} = \begin{pmatrix} X'R^{-1}\boldsymbol{y} \\ Z'R^{-1}\boldsymbol{y} \end{pmatrix}

The solutions to the MME, \hat{\boldsymbol{\beta}} and \hat{\boldsymbol{u}}, are the best linear unbiased estimates and predictors of \boldsymbol{\beta} and \boldsymbol{u}, respectively. This is a consequence of the Gauss–Markov theorem when the conditional variance of the outcome is not proportional to the identity matrix. When the conditional variance is known, the inverse-variance weighted least squares estimate is the best linear unbiased estimate. However, the conditional variance is rarely, if ever, known, so it is desirable to estimate the variance components jointly with the weighted parameter estimates when solving the MME.

One method used to fit such mixed models is the expectation–maximization (EM) algorithm, in which the variance components are treated as unobserved nuisance parameters in the joint likelihood. Currently, this is the method implemented in statistical software such as Python (statsmodels package) and SAS (proc mixed), and as an initial step only in R's nlme package lme(). The solution to the mixed model equations is a maximum likelihood estimate when the distribution of the errors is normal. There are several other methods to fit mixed models, including: using an EM algorithm initially and then Newton–Raphson (used by R package nlme's lme()); penalized least squares to obtain a profiled log likelihood depending only on the (low-dimensional) variance–covariance parameters of \boldsymbol{u}, i.e., its covariance matrix G, followed by modern direct optimization of that reduced objective function (used by R's lme4 package lmer() and the Julia package MixedModels.jl); and direct optimization of the likelihood (used by, e.g., R's glmmTMB). Notably, while the canonical form proposed by Henderson is useful for theory, many popular software packages use a different formulation for numerical computation in order to take advantage of sparse matrix methods (e.g. lme4 and MixedModels.jl).
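For the idealized case the MME address, where the variance components G and R are taken as known, the equations can be assembled and solved directly. The sketch below is a minimal numpy illustration, not how production software computes these quantities (as noted above, packages like lme4 use sparse reformulations); `solve_mme` is a hypothetical helper name.

```python
import numpy as np

def solve_mme(y, X, Z, G, R):
    """Solve Henderson's mixed model equations for (beta_hat, u_hat),
    assuming the variance components G = Var(u) and R = Var(e) are known."""
    Ri = np.linalg.inv(R)
    Gi = np.linalg.inv(G)
    # Block coefficient matrix of the MME
    top = np.hstack([X.T @ Ri @ X, X.T @ Ri @ Z])
    bot = np.hstack([Z.T @ Ri @ X, Z.T @ Ri @ Z + Gi])
    lhs = np.vstack([top, bot])
    # Right-hand side [X'R^{-1}y ; Z'R^{-1}y]
    rhs = np.concatenate([X.T @ Ri @ y, Z.T @ Ri @ y])
    sol = np.linalg.solve(lhs, rhs)
    p = X.shape[1]
    return sol[:p], sol[p:]
```

A useful check on the implementation: the \hat{\boldsymbol{\beta}} returned here coincides with the generalized least squares estimate (X'V^{-1}X)^{-1} X'V^{-1}\boldsymbol{y} for V = ZGZ' + R, and \hat{\boldsymbol{u}} with the BLUP GZ'V^{-1}(\boldsymbol{y} - X\hat{\boldsymbol{\beta}}), which is why solving the MME yields the BLUE and BLUP simultaneously.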


See also

* Nonlinear mixed-effects model
* Fixed effects model
* Generalized linear mixed model
* Linear regression
* Mixed-design analysis of variance
* Multilevel model
* Random effects model
* Repeated measures design
* Empirical Bayes method

