Endogeneity (econometrics)
In econometrics, endogeneity broadly refers to situations in which an explanatory variable is correlated with the error term. The distinction between endogenous and exogenous variables originated in simultaneous equations models, where one separates variables whose values are determined by the model from variables which are predetermined; ignoring simultaneity in the estimation leads to biased estimates as it violates the exogeneity assumption of the Gauss–Markov theorem. The problem of endogeneity is often ignored by researchers conducting non-experimental research, and doing so precludes making policy recommendations. Instrumental variable techniques are commonly used to address this problem. Besides simultaneity, correlation between explanatory variables and the error term can arise when an unobserved or omitted variable is confounding both independent and dependent variables, or when independent variables are measured with error.


Exogeneity versus endogeneity

In a stochastic model, the notions of ''usual exogeneity'', ''sequential exogeneity'', and ''strong/strict exogeneity'' can be defined. Exogeneity is always stated relative to a parameter: a variable (or set of variables) is exogenous for parameter \alpha. Even if a variable is exogenous for parameter \alpha, it might be endogenous for parameter \beta. When the explanatory variables are not stochastic, they are strongly exogenous for all the parameters. If the independent variable is correlated with the error term in a regression model, then the estimate of the regression coefficient in an ordinary least squares (OLS) regression is biased; however, if the correlation is not contemporaneous, then the coefficient estimate may still be consistent. There are many methods of correcting the bias, including instrumental variable regression and Heckman selection correction.
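
As a rough illustration of the instrumental variable approach mentioned above, the following Python sketch simulates a regressor x that is correlated with the error term and compares the OLS slope with a simple one-instrument IV estimate. The instrument w, the data-generating process, the coefficient values and the function name are all assumptions made for this example; they are not taken from any particular study.

```python
import numpy as np

def ols_vs_iv(n=100_000, alpha=1.0, beta=2.0, seed=0):
    """Return (ols_slope, iv_slope) for a simulated endogenous regressor."""
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(n)             # hypothetical instrument: shifts x, unrelated to u
    e = rng.standard_normal(n)             # shock shared by x and the error term
    u = 0.8 * e + rng.standard_normal(n)   # error term, correlated with x through e
    x = w + e
    y = alpha + beta * x + u

    # OLS slope: Cov(x, y) / Var(x); inconsistent here because Cov(x, u) != 0
    ols_slope = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
    # Simple IV slope with a single instrument: Cov(w, y) / Cov(w, x)
    iv_slope = np.cov(w, y)[0, 1] / np.cov(w, x)[0, 1]
    return ols_slope, iv_slope

if __name__ == "__main__":
    ols_slope, iv_slope = ols_vs_iv()
    print(f"true beta = 2.0, OLS = {ols_slope:.3f}, IV = {iv_slope:.3f}")
```

In this setup the OLS slope settles above the true \beta because Cov(x, u) > 0, while the IV estimate stays close to it. The comparison only works because w is constructed to be uncorrelated with u; with real data that property cannot be checked directly and has to be argued.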


Static models

The following are some common sources of endogeneity.


Omitted variable

In this case, the endogeneity comes from an uncontrolled confounding variable, a variable that is correlated with both the independent variable in the model and with the error term. (Equivalently, the omitted variable affects the independent variable and separately affects the dependent variable.) Assume that the "true" model to be estimated is

: y_i = \alpha + \beta x_i + \gamma z_i + u_i

but z_i is omitted from the regression model (perhaps because there is no way to measure it directly). Then the model that is actually estimated is

: y_i = \alpha + \beta x_i + \varepsilon_i

where \varepsilon_i = \gamma z_i + u_i (thus, the z_i term has been absorbed into the error term). If the correlation of x and z is not 0 and z separately affects y (meaning \gamma \neq 0), then x is correlated with the error term \varepsilon. Here, x is not exogenous for \alpha and \beta, since, given x, the distribution of y depends not only on \alpha and \beta, but also on z and \gamma.
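
The omitted-variable argument above can be checked with a small simulation. The sketch below is only illustrative: the parameter values, the way x is made to depend on z, and the function name are assumptions chosen for the example. It fits the "short" regression that omits z and the "long" regression that includes it.

```python
import numpy as np

def simulate_omitted_variable(n=100_000, alpha=1.0, beta=2.0, gamma=3.0, seed=0):
    """Return the OLS slope on x when z is omitted and when z is included."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n)
    x = 0.5 * z + rng.standard_normal(n)   # assumed link: x is correlated with z
    u = rng.standard_normal(n)
    y = alpha + beta * x + gamma * z + u   # the "true" model

    ones = np.ones(n)
    # Short regression: y on (1, x); z is absorbed into the error term
    b_short = np.linalg.lstsq(np.column_stack([ones, x]), y, rcond=None)[0]
    # Long regression: y on (1, x, z); the confounder is controlled for
    b_long = np.linalg.lstsq(np.column_stack([ones, x, z]), y, rcond=None)[0]
    return b_short[1], b_long[1]

if __name__ == "__main__":
    short_slope, long_slope = simulate_omitted_variable()
    print(f"omitting z: {short_slope:.3f}, including z: {long_slope:.3f}")
```

With these assumed values, the short-regression slope converges to \beta + \gamma Cov(x, z) / Var(x) = 2 + 3(0.5/1.25) = 3.2 rather than to \beta = 2, while the long regression recovers \beta.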


Measurement error

Suppose that a perfect measure of an independent variable is impossible. That is, instead of observing x^*_i, what is actually observed is x_i = x^*_i + \nu_i, where \nu_i is the measurement error or "noise". In this case, a model given by

: y_i = \alpha + \beta x^*_i + \varepsilon_i

can be written in terms of observables and error terms as

: \begin{align} y_i & = \alpha + \beta(x_i - \nu_i) + \varepsilon_i \\ y_i & = \alpha + \beta x_i + (\varepsilon_i - \beta\nu_i) \\ y_i & = \alpha + \beta x_i + u_i \quad (\text{where } u_i = \varepsilon_i - \beta\nu_i) \end{align}

Since both x_i and u_i depend on \nu_i, they are correlated, so the OLS estimate of \beta will be biased. When \nu_i is uncorrelated with x^*_i and \varepsilon_i, the bias is downward in magnitude, that is, toward zero (attenuation bias). Measurement error in the dependent variable, y_i, does not cause endogeneity, though it does increase the variance of the error term.
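
A minimal simulation sketch of the two cases described above, assuming classical measurement error (noise independent of the true regressor and of the model error). The variances, coefficients and function name are illustrative assumptions.

```python
import numpy as np

def simulate_measurement_error(n=100_000, alpha=1.0, beta=2.0, seed=0):
    """Return OLS slopes with noise in x and, separately, with noise in y."""
    rng = np.random.default_rng(seed)
    x_true = rng.standard_normal(n)          # unobserved regressor, variance 1
    eps = rng.standard_normal(n)
    y = alpha + beta * x_true + eps

    x_obs = x_true + rng.standard_normal(n)  # observed regressor, noise variance 1
    y_obs = y + rng.standard_normal(n)       # observed outcome with its own noise

    def slope(a, b):
        # OLS slope of b on a (with intercept): Cov(a, b) / Var(a)
        return np.cov(a, b)[0, 1] / np.var(a, ddof=1)

    return slope(x_obs, y), slope(x_true, y_obs)

if __name__ == "__main__":
    noisy_x_slope, noisy_y_slope = simulate_measurement_error()
    print(f"noise in x: {noisy_x_slope:.3f}, noise in y: {noisy_y_slope:.3f}")
```

Under these assumptions the slope with a noisy regressor converges to \beta Var(x^*) / (Var(x^*) + Var(\nu)) = 2(1/2) = 1, half the true value, whereas noise in the dependent variable leaves the slope near \beta = 2 and only makes the estimate less precise.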


Simultaneity

Suppose that two variables are codetermined, with each affecting the other according to the following "structural" equations:

: y_i = \beta_1 x_i + \gamma_1 z_i + u_i
: z_i = \beta_2 x_i + \gamma_2 y_i + v_i

Estimating either equation by itself results in endogeneity. In the case of the first structural equation, \operatorname E(z_i u_i) \neq 0. Substituting the first equation into the second and solving for z_i, while assuming that 1 - \gamma_1 \gamma_2 \neq 0, results in

: z_i = \frac{\beta_2 + \gamma_2 \beta_1}{1 - \gamma_1 \gamma_2} x_i + \frac{1}{1 - \gamma_1 \gamma_2} v_i + \frac{\gamma_2}{1 - \gamma_1 \gamma_2} u_i.

Assuming that x_i and v_i are uncorrelated with u_i,

: \operatorname E(z_i u_i) = \frac{\gamma_2}{1 - \gamma_1 \gamma_2} \operatorname E(u_i u_i) \neq 0

whenever \gamma_2 \neq 0. Therefore, attempts at estimating either structural equation will be hampered by endogeneity.
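
The structural system above can be simulated by solving for z_i and y_i jointly and then running OLS on the first equation alone. This is a sketch under assumed parameter values, with standard normal x, u and v that are mutually independent; the function name is hypothetical.

```python
import numpy as np

def simulate_simultaneity(n=200_000, b1=1.0, g1=0.5, b2=1.0, g2=0.5, seed=0):
    """Return (OLS estimate of gamma_1, sample mean of z*u) for the two-equation system."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n)
    u = rng.standard_normal(n)
    v = rng.standard_normal(n)

    # Reduced form for z (requires 1 - g1*g2 != 0), then y from the first equation
    denom = 1.0 - g1 * g2
    z = ((b2 + g2 * b1) * x + v + g2 * u) / denom
    y = b1 * x + g1 * z + u

    # OLS on the first structural equation only: y on (x, z), no intercept
    coef = np.linalg.lstsq(np.column_stack([x, z]), y, rcond=None)[0]
    return coef[1], np.mean(z * u)

if __name__ == "__main__":
    g1_hat, zu = simulate_simultaneity()
    print(f"true gamma_1 = 0.5, OLS estimate = {g1_hat:.3f}, mean(z*u) = {zu:.3f}")
```

With these values the sample mean of z_i u_i is close to \gamma_2 / (1 - \gamma_1 \gamma_2) = 2/3, matching the expression for E(z_i u_i), and the OLS estimate of \gamma_1 converges to about 0.8 instead of the true 0.5.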


Dynamic models

The endogeneity problem is particularly relevant in the context of time series analysis of causal processes. It is common for some factors within a causal system to be dependent for their value in period ''t'' on the values of other factors in the causal system in period ''t'' − 1. Suppose that the level of pest infestation is independent of all other factors within a given period, but is influenced by the level of rainfall and fertilizer in the preceding period. In this instance it would be correct to say that infestation is exogenous within the period, but endogenous over time. Let the model be ''y'' = ''f''(''x'', ''z'') + ''u''. If the variable ''x'' is sequentially exogenous for parameter \alpha, and ''y'' does not cause ''x'' in the Granger sense, then the variable ''x'' is strongly/strictly exogenous for the parameter \alpha.
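
A lagged dependent variable gives a compact illustration of a regressor that is sequentially but not strictly exogenous. In the assumed AR(1) model y_t = \rho y_{t-1} + u_t below, y_{t-1} is uncorrelated with the current error u_t but correlated with past errors, so OLS is biased in short samples yet still consistent. The model, the parameter values and the function name are assumptions made for this sketch.

```python
import numpy as np

def mean_ols_ar1(rho=0.5, T=20, reps=2000, seed=0):
    """Average OLS estimate of rho over many simulated AR(1) series of length T."""
    rng = np.random.default_rng(seed)
    u = rng.standard_normal((reps, T + 1))
    y = np.zeros((reps, T + 1))
    y[:, 0] = u[:, 0] / np.sqrt(1.0 - rho**2)   # draw y_0 from the stationary distribution
    for t in range(1, T + 1):
        y[:, t] = rho * y[:, t - 1] + u[:, t]

    y_lag, y_cur = y[:, :-1], y[:, 1:]
    # OLS slope of y_t on y_{t-1} with an intercept, computed per replication
    lag_c = y_lag - y_lag.mean(axis=1, keepdims=True)
    cur_c = y_cur - y_cur.mean(axis=1, keepdims=True)
    slopes = (lag_c * cur_c).sum(axis=1) / (lag_c**2).sum(axis=1)
    return slopes.mean()

if __name__ == "__main__":
    print(f"T = 20:   mean estimate = {mean_ols_ar1(T=20):.3f}")
    print(f"T = 2000: mean estimate = {mean_ols_ar1(T=2000, reps=200):.3f}")
```

With short series the average estimate falls noticeably below the true \rho = 0.5, and the gap shrinks as T grows, which is the pattern one expects when the correlation between regressor and error is not contemporaneous.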


Simultaneity

Generally speaking, simultaneity occurs in the dynamic model just like in the example of static simultaneity above.


See also

* Virtuous circle and vicious circle
* Heterogeneity
* Dependent and independent variables




External links

* Lecture on Simultaneity Bias by Mark Thoma (YouTube, video id WlOtUA8Rqw8)
* Seth Godin's simple views on endogeneity