In statistics and econometrics, the first-difference (FD) estimator is an estimator used to address the problem of omitted variables with panel data. It is consistent under the assumptions of the fixed effects model. In certain situations, for example when the idiosyncratic errors follow a random walk, it can be more efficient than the standard fixed effects (or "within") estimator. The estimator requires data on a dependent variable, $y_{it}$, and independent variables, $x_{it}$, for a set of individual units $i = 1, \dots, N$ and time periods $t = 1, \dots, T$. The estimator is obtained by running a pooled ordinary least squares (OLS) estimation for a regression of $\Delta y_{it}$ on $\Delta x_{it}$.
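The pooled OLS regression on differenced data can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the panel is balanced, the data are simulated, and the function name `first_difference_ols`, the array layout, and the parameter values are all hypothetical choices made for this example.

```python
import numpy as np

def first_difference_ols(y, X):
    """FD estimate for a balanced panel (hypothetical helper).

    y has shape (N, T) and X has shape (N, T, k). The result is the
    k-vector from a pooled OLS regression of Delta y_it on Delta x_it,
    without an intercept.
    """
    k = X.shape[2]
    dy = np.diff(y, axis=1).reshape(-1)     # stacked Delta y_it, t = 2..T
    dX = np.diff(X, axis=1).reshape(-1, k)  # stacked Delta x_it, same ordering
    coef, *_ = np.linalg.lstsq(dX, dy, rcond=None)
    return coef

# Simulated data (an assumption of this sketch): the unobserved effect c_i
# is correlated with the regressors, so OLS in levels is biased.
rng = np.random.default_rng(0)
N, T = 500, 4
beta = np.array([1.5, -0.7])
c = rng.normal(size=(N, 1))
X = rng.normal(size=(N, T, 2)) + c[:, :, None]
u = rng.normal(size=(N, T))
y = X @ beta + c + u

beta_fd = first_difference_ols(y, X)

# Pooled OLS in levels, for comparison; it ignores c_i entirely.
beta_ols, *_ = np.linalg.lstsq(X.reshape(-1, 2), y.reshape(-1), rcond=None)

print("true beta:    ", beta)
print("FD estimate:  ", beta_fd)
print("levels OLS:   ", beta_ols)
```

In this simulated setting the FD estimate should land near the true coefficients, while the levels regression is pulled away from them because it absorbs the omitted $c_i$.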

Derivation

The FD estimator avoids bias due to some unobserved, time-invariant variable $c_i$ by using the repeated observations over time:

$$y_{it} = x_{it}\beta + c_i + u_{it}, \quad t = 1, \dots, T,$$

$$y_{i,t-1} = x_{i,t-1}\beta + c_i + u_{i,t-1}, \quad t = 2, \dots, T.$$

Differencing the equations gives:

$$\Delta y_{it} = y_{it} - y_{i,t-1} = \Delta x_{it}\beta + \Delta u_{it}, \quad t = 2, \dots, T,$$

which removes the unobserved $c_i$. The FD estimator $\hat{\beta}_{FD}$ is then obtained by applying OLS to the differenced terms for $x$ and $y$:

$$\hat{\beta}_{FD} = (\Delta X'\Delta X)^{-1}\Delta X'\Delta y = \beta + (\Delta X'\Delta X)^{-1}\Delta X'\Delta u$$

where $X$, $y$, and $u$ denote the matrices of the relevant variables, stacked over units and time periods. Note that the rank condition must be met for $\Delta X'\Delta X$ to be invertible ($\operatorname{rank}[\Delta X'\Delta X] = k$), where $k$ is the number of regressors.

Let $\Delta X_i = [\Delta X_{i2}, \Delta X_{i3}, \dots, \Delta X_{iT}]$ collect the differenced regressors of unit $i$, and define $\Delta u_i$ analogously. If $E[u_{it} \mid x_{i1}, x_{i2}, \dots, x_{iT}] = 0$, then by the central limit theorem, the law of large numbers, and Slutsky's theorem, the estimator is asymptotically normally distributed, with asymptotic variance

$$E[\Delta X_i'\Delta X_i]^{-1} \, E[\Delta X_i'\Delta u_i \Delta u_i'\Delta X_i] \, E[\Delta X_i'\Delta X_i]^{-1}.$$

Under the assumption of homoskedasticity and no serial correlation in the differenced errors, that is, $\operatorname{Var}(\Delta u \mid X) = \sigma^2_{\Delta u} I$, the asymptotic variance can be estimated with

$$\widehat{\operatorname{Avar}}(\hat{\beta}_{FD}) = \hat{\sigma}^2_{\Delta u}(\Delta X'\Delta X)^{-1},$$

where $\hat{\sigma}^2_{\Delta u}$ is given by

$$\hat{\sigma}^2_{\Delta u} = \frac{1}{N(T-1) - k}\sum_{i=1}^{N}\sum_{t=2}^{T} \widehat{\Delta u}_{it}^2$$

and $\widehat{\Delta u}_{it} = \Delta y_{it} - \Delta x_{it}\hat{\beta}_{FD}$.
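The variance formula above can be illustrated with a short NumPy sketch. The assumptions are loud ones: the panel is balanced and simulated, the errors $u_{it}$ are generated as a random walk so that $\Delta u_{it}$ is homoskedastic and serially uncorrelated, the residual sum of squares is divided by $N(T-1) - k$ (a common degrees-of-freedom convention, assumed here), and the function name `fd_with_classical_se` is hypothetical.

```python
import numpy as np

def fd_with_classical_se(y, X):
    """FD estimate with variance sigma2_hat * (dX'dX)^{-1} (hypothetical helper).

    y has shape (N, T); X has shape (N, T, k). The variance formula is only
    appropriate when Delta u is homoskedastic and serially uncorrelated.
    """
    N, T, k = X.shape
    dy = np.diff(y, axis=1).reshape(-1)
    dX = np.diff(X, axis=1).reshape(-1, k)
    xtx_inv = np.linalg.inv(dX.T @ dX)          # needs rank(dX'dX) = k
    beta_fd = xtx_inv @ dX.T @ dy
    resid = dy - dX @ beta_fd                   # estimated Delta u_it
    sigma2 = resid @ resid / (N * (T - 1) - k)  # assumed d.o.f. correction
    avar = sigma2 * xtx_inv
    return beta_fd, sigma2, np.sqrt(np.diag(avar))

# Simulated data (assumption): u_it is a random walk, so Delta u_it is
# i.i.d. with variance 1 and the classical formula applies.
rng = np.random.default_rng(1)
N, T = 400, 5
beta = np.array([2.0, 0.5])
c = rng.normal(size=(N, 1))
X = rng.normal(size=(N, T, 2)) + c[:, :, None]
u = np.cumsum(rng.normal(size=(N, T)), axis=1)
y = X @ beta + c + u

beta_fd, sigma2, se = fd_with_classical_se(y, X)
print("FD estimate:", beta_fd)
print("sigma^2 hat:", sigma2)
print("std. errors:", se)
```

When the differenced errors are heteroskedastic or serially correlated, this estimate is not appropriate, and an estimate based on the general sandwich form of the asymptotic variance would be the natural alternative.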

Properties

To be unbiased, the fixed effects (FE) estimator requires strict exogeneity, $E[u_{it} \mid x_{i1}, x_{i2}, \dots, x_{iT}] = 0$. The first-difference estimator is also unbiased under this assumption. Under the weaker assumption that $E[(u_{it} - u_{i,t-1})(x_{it} - x_{i,t-1})] = 0$, the FD estimator is consistent. Note that this assumption is less restrictive than strict exogeneity, which is required for consistency of the FE estimator when $T$ is fixed. If $T$ goes to infinity, then both FE and FD are consistent under the weaker assumption of contemporaneous exogeneity.

Relation to fixed effects estimator

For $T = 2$, the FD and fixed effects estimators are numerically equivalent. Under the assumption of homoscedasticity and no serial correlation in $u_{it}$, the FE estimator is more efficient than the FD estimator. This is because first differencing induces serial correlation in the differenced errors: when the $u_{it}$ are serially uncorrelated with variance $\sigma^2_u$, adjacent differences share the term $u_{i,t-1}$, so $\operatorname{Cov}(\Delta u_{it}, \Delta u_{i,t-1}) = -\sigma^2_u$. If $u_{it}$ follows a random walk, however, the FD estimator is more efficient, as the $\Delta u_{it}$ are serially uncorrelated.
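The numerical equivalence for $T = 2$ can be checked on simulated data. The sketch below is an illustration only: the data-generating process, the seed, and the helper name `fd_and_fe` are hypothetical, and both regressions are run without an intercept or time dummies.

```python
import numpy as np

def fd_and_fe(T, seed=2, N=200, k=2):
    """Return (FD, FE) estimates on one simulated balanced panel (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    beta = np.array([1.0, -2.0])
    c = rng.normal(size=(N, 1))
    X = rng.normal(size=(N, T, k)) + c[:, :, None]
    y = X @ beta + c + rng.normal(size=(N, T))

    # First differences: pooled OLS of Delta y on Delta x.
    dX = np.diff(X, axis=1).reshape(-1, k)
    dy = np.diff(y, axis=1).reshape(-1)
    beta_fd = np.linalg.lstsq(dX, dy, rcond=None)[0]

    # Within (FE) transformation: subtract each unit's time average.
    Xw = (X - X.mean(axis=1, keepdims=True)).reshape(-1, k)
    yw = (y - y.mean(axis=1, keepdims=True)).reshape(-1)
    beta_fe = np.linalg.lstsq(Xw, yw, rcond=None)[0]
    return beta_fd, beta_fe

fd2, fe2 = fd_and_fe(T=2)
fd4, fe4 = fd_and_fe(T=4)
print("T=2 identical:", np.allclose(fd2, fe2))
print("T=4 identical:", np.allclose(fd4, fe4))
```

With two periods, each demeaned observation equals plus or minus half of the first difference, which is why the two sets of coefficients coincide; with more periods the two transformations weight the data differently and the estimates generally differ.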

