Expected mean squares

In statistics, expected mean squares (EMS) are the expected values of certain statistics arising in partitions of sums of squares in the analysis of variance (ANOVA). They can be used for ascertaining which statistic should appear in the denominator in an F-test for testing a null hypothesis that a particular effect is absent.


Definition

When the total corrected sum of squares in an ANOVA is partitioned into several components, each attributed to the effect of a particular predictor variable, each of the sums of squares in that partition is a random variable that has an expected value. That expected value divided by the corresponding number of degrees of freedom is the expected mean square for that predictor variable.
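
To make the definition concrete, the following Python sketch (not from any source cited here; the one-way fixed-effects model, the group sizes, and the parameter values are all illustrative assumptions) simulates many data sets, partitions the total corrected sum of squares into between-group and within-group components, and checks that the Monte Carlo average of each mean square approaches its expected mean square.

 import numpy as np
 
 # Minimal sketch: one-way fixed-effects ANOVA with k groups of m
 # observations each (all parameter values invented for illustration).
 # Simulates Y_ij = mu + alpha_i + eps_ij with eps_ij ~ N(0, sigma^2) and
 # checks the Monte Carlo average of each mean square against its EMS.
 rng = np.random.default_rng(0)
 k, m = 4, 10                              # groups, observations per group
 mu, sigma = 5.0, 2.0
 alpha = np.array([-1.0, 0.0, 0.5, 0.5])   # fixed group effects (sum to zero)
 
 ms_between, ms_within = [], []
 for _ in range(20000):
     y = mu + alpha[:, None] + rng.normal(0.0, sigma, size=(k, m))
     group_means = y.mean(axis=1)
     grand_mean = y.mean()
     ss_between = m * ((group_means - grand_mean) ** 2).sum()
     ss_within = ((y - group_means[:, None]) ** 2).sum()
     ms_between.append(ss_between / (k - 1))      # df = k - 1
     ms_within.append(ss_within / (k * (m - 1)))  # df = k(m - 1)
 
 # Expected mean squares for this model:
 # E(MS_between) = sigma^2 + m * sum(alpha^2) / (k - 1); E(MS_within) = sigma^2.
 print(np.mean(ms_between), sigma**2 + m * (alpha**2).sum() / (k - 1))
 print(np.mean(ms_within), sigma**2)

With the values above, the between-group average should be near 9.0 (since 4 + 10·1.5/3 = 9) and the within-group average near 4.0.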


Example

The following example is from ''Longitudinal Data Analysis'' by Donald Hedeker and Robert D. Gibbons.

Each of ''s'' treatments (one of which may be a placebo) is administered to a sample of (capital) ''N'' randomly chosen patients, on whom certain measurements Y_{hij} are observed at each of (lower-case) ''n'' specified times, for h=1,\ldots,s, \quad i=1,\ldots,N_h (thus the numbers of patients receiving different treatments may differ), and j=1,\ldots,n. We assume the sets of patients receiving different treatments are disjoint, so patients are nested within treatments and not crossed with treatments. We have

: Y_{hij} = \mu + \gamma_h + \tau_j + (\gamma\tau)_{hj} + \pi_{i(h)} + \varepsilon_{hij}

where
*\mu = grand mean (fixed),
*\gamma_h = effect of treatment h (fixed),
*\tau_j = effect of time j (fixed),
*(\gamma\tau)_{hj} = interaction effect of treatment h and time j (fixed),
*\pi_{i(h)} = individual difference effect for patient i nested within treatment h (random),
*\varepsilon_{hij} = error for patient i in treatment h at time j (random),
*\sigma_\pi^2 = variance of the random effect of patients nested within treatments,
*\sigma_\varepsilon^2 = error variance.

The total corrected sum of squares is

: \sum_{hij} (Y_{hij} - \overline Y)^2 \quad\text{where } \overline Y = \frac{1}{nN} \sum_{hij} Y_{hij}.

The ANOVA table below partitions the sum of squares (where N = \sum_h N_h):

 Source of variation          | Degrees of freedom | Expected mean square
 Treatments                   | s - 1              | \sigma_\varepsilon^2 + n\sigma_\pi^2 + D_\text{Tr}
 Time                         | n - 1              | \sigma_\varepsilon^2 + D_\text{T}
 Treatment × time interaction | (s - 1)(n - 1)     | \sigma_\varepsilon^2 + D_{\text{Tr}\times\text{T}}
 Patients within treatments   | N - s              | \sigma_\varepsilon^2 + n\sigma_\pi^2
 Error                        | (N - s)(n - 1)     | \sigma_\varepsilon^2

Here D_\text{Tr}, D_\text{T}, and D_{\text{Tr}\times\text{T}} are nonnegative quantities determined by the fixed effects \gamma_h, \tau_j, and (\gamma\tau)_{hj} respectively; each is zero exactly when all of the corresponding fixed effects are zero.
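
As a numerical check of two rows of the table, the following Python sketch (an illustration, not from the cited text; all design sizes and parameter values are invented) simulates the model above and verifies by Monte Carlo that the mean square for patients within treatments averages approximately \sigma_\varepsilon^2 + n\sigma_\pi^2 while the error mean square averages approximately \sigma_\varepsilon^2.

 import numpy as np
 
 # Sketch (invented parameter values): simulate
 # Y_hij = mu + gamma_h + tau_j + (gamma*tau)_hj + pi_i(h) + eps_hij
 # and check two rows of the EMS table by Monte Carlo.
 rng = np.random.default_rng(1)
 s, n = 3, 4                       # treatments, time points
 N_h = np.array([5, 6, 7])         # patients per treatment (may differ)
 N = N_h.sum()
 sigma_pi, sigma_eps = 1.5, 1.0
 mu = 10.0
 gamma = rng.normal(0, 1, s)       # fixed treatment effects (held fixed below)
 tau = rng.normal(0, 1, n)         # fixed time effects
 gt = rng.normal(0, 0.5, (s, n))   # fixed interaction effects
 
 ms_patients, ms_error = [], []
 for _ in range(5000):
     blocks = []
     for h in range(s):
         pi = rng.normal(0, sigma_pi, N_h[h])         # random patient effects
         eps = rng.normal(0, sigma_eps, (N_h[h], n))  # random errors
         blocks.append(mu + gamma[h] + tau + gt[h] + pi[:, None] + eps)
     # Patients within treatments: n * sum of (patient mean - treatment mean)^2,
     # with df = N - s.
     ss_p = sum(n * ((y.mean(axis=1) - y.mean()) ** 2).sum() for y in blocks)
     # Error: residuals after removing patient and time means within each
     # treatment, with df = (N - s)(n - 1).
     ss_e = sum(((y - y.mean(axis=1, keepdims=True)
                    - y.mean(axis=0, keepdims=True) + y.mean()) ** 2).sum()
                for y in blocks)
     ms_patients.append(ss_p / (N - s))
     ms_error.append(ss_e / ((N - s) * (n - 1)))
 
 print(np.mean(ms_patients), sigma_eps**2 + n * sigma_pi**2)  # ~ 1 + 4*2.25 = 10
 print(np.mean(ms_error), sigma_eps**2)                       # ~ 1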


Use in F-tests

A null hypothesis of interest is that there is no difference between the effects of different treatments, and thus no difference among treatment means. This may be expressed by saying D_\text{Tr} = 0 (with the notation used in the table above). Under this null hypothesis, the expected mean square for effects of treatments is \sigma_\varepsilon^2 + n \sigma_\pi^2.

The numerator in the F-statistic for testing this hypothesis is the mean square due to differences among treatments, i.e. it is \text{SS}_\text{Tr}/(s - 1). The denominator, however, is ''not'' \text{SS}_\text{E}/\big((N-s)(n-1)\big). The reason is that the random variable below, although under the null hypothesis it has an F-distribution, is not observable (it is not a statistic) because its value depends on the unobservable parameters \sigma_\pi^2 and \sigma_\varepsilon^2:

: \frac{\text{SS}_\text{Tr}\big/\big((\sigma_\varepsilon^2 + n\sigma_\pi^2)(s-1)\big)}{\text{SS}_\text{E}\big/\big(\sigma_\varepsilon^2(N-s)(n-1)\big)} \ne \frac{\text{MS}_\text{Tr}}{\text{MS}_\text{E}}

Instead, one uses as the test statistic the following random variable that is not defined in terms of \text{SS}_\text{E}:

: F = \frac{\text{SS}_\text{Tr}/(s-1)}{\text{SS}_\text{S(Tr)}/(N-s)} = \frac{\text{MS}_\text{Tr}}{\text{MS}_\text{S(Tr)}}

Because the expected mean square for patients within treatments is also \sigma_\varepsilon^2 + n \sigma_\pi^2, the numerator and denominator of this ratio estimate the same quantity under the null hypothesis, and F has an F-distribution with s - 1 and N - s degrees of freedom when D_\text{Tr} = 0.
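
The following Python sketch (again illustrative; the design sizes and variance components are assumptions, and SciPy supplies the F quantile) simulates data under the null hypothesis of no treatment effect and compares the rejection rate of the correct statistic \text{MS}_\text{Tr}/\text{MS}_\text{S(Tr)} with that of the incorrect ratio \text{MS}_\text{Tr}/\text{MS}_\text{E}. The latter rejects far too often, because the expected mean square of its denominator, \sigma_\varepsilon^2, is smaller than that of its numerator, \sigma_\varepsilon^2 + n\sigma_\pi^2.

 import numpy as np
 from scipy import stats
 
 # Sketch (invented parameter values): under the null hypothesis there are
 # no treatment or interaction effects, so D_Tr = 0 holds by construction.
 rng = np.random.default_rng(2)
 s, n = 3, 4
 N_h = np.array([5, 6, 7])
 N = N_h.sum()
 sigma_pi, sigma_eps = 1.5, 1.0
 tau = rng.normal(0, 1, n)   # fixed time effects only
 
 reject_correct = reject_wrong = 0
 trials = 4000
 for _ in range(trials):
     blocks = [tau + rng.normal(0, sigma_pi, N_h[h])[:, None]
                   + rng.normal(0, sigma_eps, (N_h[h], n))
               for h in range(s)]
     grand = np.concatenate([y.ravel() for y in blocks]).mean()
     ss_tr = sum(n * N_h[h] * (blocks[h].mean() - grand) ** 2 for h in range(s))
     ss_p = sum(n * ((y.mean(axis=1) - y.mean()) ** 2).sum() for y in blocks)
     ss_e = sum(((y - y.mean(axis=1, keepdims=True)
                    - y.mean(axis=0, keepdims=True) + y.mean()) ** 2).sum()
                for y in blocks)
     ms_tr = ss_tr / (s - 1)
     ms_p = ss_p / (N - s)                  # MS for patients within treatments
     ms_e = ss_e / ((N - s) * (n - 1))      # error MS
     reject_correct += ms_tr / ms_p > stats.f.ppf(0.95, s - 1, N - s)
     reject_wrong += ms_tr / ms_e > stats.f.ppf(0.95, s - 1, (N - s) * (n - 1))
 
 print(reject_correct / trials)  # ~0.05: denominator chosen from the EMS table
 print(reject_wrong / trials)    # well above 0.05: wrong denominator

This is the practical use of expected mean squares: the denominator of the F-test is the mean square whose expectation matches that of the numerator when the null hypothesis is true.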


Notes and references

* Donald Hedeker, Robert D. Gibbons. ''Longitudinal Data Analysis''. Wiley-Interscience, 2006, pp. 21–24.