In regression, mean response (or expected response) and predicted response, also known as mean outcome (or expected outcome) and predicted outcome, are values of the dependent variable calculated from the regression parameters and a given value of the independent variable. The values of these two responses are the same, but their calculated variances are different. The concept is a generalization of the distinction between the standard error of the mean and the sample standard deviation.


Background

In straight line fitting, the model is

:y_i = \alpha + \beta x_i + \varepsilon_i \,

where y_i is the response variable, x_i is the explanatory variable, \varepsilon_i is the random error, and \alpha and \beta are parameters. The mean, and predicted, response value for a given explanatory value ''x_d'' is given by

:\hat{y}_d = \hat{\alpha} + \hat{\beta} x_d ,

while the actual response would be

:y_d = \alpha + \beta x_d + \varepsilon_d . \,

Expressions for the values and variances of \hat{\alpha} and \hat{\beta} are given in linear regression.
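As a concrete illustration, the following is a minimal Python sketch (not part of the original article; the data values are invented) that fits the straight-line model by ordinary least squares and evaluates the mean/predicted response value at a chosen point ''x_d'':

    # Minimal sketch: ordinary least squares fit of y_i = alpha + beta*x_i + eps_i.
    # The data arrays x and y are illustrative, not from the article.
    import numpy as np

    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    y = np.array([2.1, 2.9, 4.2, 4.8, 6.1])

    m = len(x)                        # number of data points
    x_bar, y_bar = x.mean(), y.mean()

    # Least-squares estimates of slope and intercept.
    beta_hat = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
    alpha_hat = y_bar - beta_hat * x_bar

    x_d = 3.5                         # explanatory value of interest
    y_hat_d = alpha_hat + beta_hat * x_d   # mean (= predicted) response value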


Variance of the mean response

Since the data in this context is defined to be (''x'', ''y'') pairs for every observation, the ''mean response'' at a given value of ''x'', say ''x_d'', is an estimate of the mean of the ''y'' values in the population at the ''x'' value of ''x_d'', that is \hat{E}(y \mid x_d) \equiv \hat{y}_d. The variance of the mean response is given by

:\operatorname{Var}\left(\hat{\alpha} + \hat{\beta} x_d\right) = \operatorname{Var}\left(\hat{\alpha}\right) + \left(\operatorname{Var} \hat{\beta}\right) x_d^2 + 2 x_d \operatorname{Cov}\left(\hat{\alpha}, \hat{\beta}\right) .

This expression can be simplified to

:\operatorname{Var}\left(\hat{\alpha} + \hat{\beta} x_d\right) = \sigma^2 \left(\frac{1}{m} + \frac{(x_d - \bar{x})^2}{\sum (x_i - \bar{x})^2}\right),

where ''m'' is the number of data points. To demonstrate this simplification, one can make use of the identity

:\sum (x_i - \bar{x})^2 = \sum x_i^2 - \frac{1}{m} \left(\sum x_i\right)^2 .
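Continuing the sketch above, this variance can be computed directly; estimating \sigma^2 from the residuals with m - 2 degrees of freedom is a standard choice (the article leaves the estimator unspecified):

    # Estimate sigma^2 from the residuals of the fit (m - 2 degrees of freedom).
    residuals = y - (alpha_hat + beta_hat * x)
    sigma2_hat = np.sum(residuals ** 2) / (m - 2)

    # Variance of the mean response at x_d:
    # sigma^2 * (1/m + (x_d - x_bar)^2 / sum((x_i - x_bar)^2)).
    s_xx = np.sum((x - x_bar) ** 2)
    var_mean_response = sigma2_hat * (1.0 / m + (x_d - x_bar) ** 2 / s_xx)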


Variance of the predicted response

The ''predicted response'' distribution is the predicted distribution of the residuals at the given point ''x_d''. So the variance is given by

:\begin{align}
\operatorname{Var}\left(y_d - \left[\hat{\alpha} + \hat{\beta} x_d\right]\right) &= \operatorname{Var}(y_d) + \operatorname{Var}\left(\hat{\alpha} + \hat{\beta} x_d\right) - 2 \operatorname{Cov}\left(y_d, \left[\hat{\alpha} + \hat{\beta} x_d\right]\right) \\
&= \operatorname{Var}(y_d) + \operatorname{Var}\left(\hat{\alpha} + \hat{\beta} x_d\right).
\end{align}

The second line follows from the fact that \operatorname{Cov}\left(y_d, \left[\hat{\alpha} + \hat{\beta} x_d\right]\right) is zero because the new prediction point is independent of the data used to fit the model. Additionally, the term \operatorname{Var}\left(\hat{\alpha} + \hat{\beta} x_d\right) was calculated earlier for the mean response. Since \operatorname{Var}(y_d) = \sigma^2 (a fixed but unknown parameter that can be estimated), the variance of the predicted response is given by

:\begin{align}
\operatorname{Var}\left(y_d - \left[\hat{\alpha} + \hat{\beta} x_d\right]\right) &= \sigma^2 + \sigma^2 \left(\frac{1}{m} + \frac{(x_d - \bar{x})^2}{\sum (x_i - \bar{x})^2}\right) \\
&= \sigma^2 \left(1 + \frac{1}{m} + \frac{(x_d - \bar{x})^2}{\sum (x_i - \bar{x})^2}\right).
\end{align}
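In the running Python sketch, the predicted-response variance simply adds the estimate of \sigma^2 to the mean-response variance:

    # Variance of the predicted response at x_d: the extra sigma^2 term
    # accounts for the random error of a new, unseen observation.
    var_predicted_response = sigma2_hat * (1.0 + 1.0 / m + (x_d - x_bar) ** 2 / s_xx)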


Confidence intervals

The 100(1-\alpha)\% confidence intervals are computed as

:\hat{y}_d \pm t_{\frac{\alpha}{2},\, m-2} \sqrt{\operatorname{Var}} ,

where \operatorname{Var} is the appropriate variance from above (with \sigma^2 replaced by its estimate) and m - 2 is the residual degrees of freedom for straight-line fitting (more generally, m minus the number of fitted parameters). Thus, the confidence interval for the predicted response is wider than the interval for the mean response. This is expected intuitively: the variance of the population of ''y'' values does not shrink when one samples from it, because the random error \varepsilon_d affecting a new observation does not decrease, but the variance of the mean of the ''y'' does shrink with increased sampling, because the variances of \hat{\alpha} and \hat{\beta} decrease, so the mean response (predicted response value) becomes closer to \alpha + \beta x_d. This is analogous to the difference between the variance of a population and the variance of the sample mean of a population: the variance of a population is a parameter and does not change, but the variance of the sample mean decreases with increased samples.
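Both intervals can be computed in the running sketch with SciPy's Student's ''t'' quantile (df = m - 2 matches the straight-line fit above):

    # 95% intervals around y_hat_d using a t quantile with m - 2 d.o.f.
    from scipy.stats import t

    alpha_level = 0.05
    t_crit = t.ppf(1.0 - alpha_level / 2.0, df=m - 2)

    ci_mean = (y_hat_d - t_crit * np.sqrt(var_mean_response),
               y_hat_d + t_crit * np.sqrt(var_mean_response))
    pi_new = (y_hat_d - t_crit * np.sqrt(var_predicted_response),
              y_hat_d + t_crit * np.sqrt(var_predicted_response))
    # pi_new (the prediction interval) is always wider than ci_mean.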


General linear regression

The general linear model can be written as

:y_i = \sum_{j=1}^n X_{ij} \beta_j + \varepsilon_i . \,

Therefore, since \hat{y}_d = \sum_{j=1}^n X_{dj} \hat{\beta}_j , the general expression for the variance of the mean response is

:\operatorname{Var}\left(\sum_{j=1}^n X_{dj} \hat{\beta}_j\right) = \sum_{i=1}^n \sum_{j=1}^n X_{di} S_{ij} X_{dj} ,

where \mathbf{S} is the covariance matrix of the parameters, given by

:\mathbf{S} = \sigma^2 \left(\mathbf{X}^\mathsf{T} \mathbf{X}\right)^{-1} .
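In matrix form this is x_d^\mathsf{T} \mathbf{S} x_d for a new design row x_d, which a short Python sketch (reusing the earlier example data; the design matrix here is illustrative) makes explicit:

    # General linear model: design matrix X with a column of ones (intercept).
    X = np.column_stack([np.ones(m), x])
    beta_hat_vec = np.linalg.solve(X.T @ X, X.T @ y)   # equals [alpha_hat, beta_hat]

    # Covariance matrix of the parameters: S = sigma^2 * (X^T X)^(-1).
    S = sigma2_hat * np.linalg.inv(X.T @ X)

    # Mean-response variance at the design row for x_d; this reproduces
    # var_mean_response from the simple-regression formula above.
    x_d_row = np.array([1.0, x_d])
    var_mean_response_glm = x_d_row @ S @ x_d_row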


See also

* Expected value
* Prediction error
* Regression prediction

