The topic of heteroskedasticity-consistent (HC) standard errors arises in statistics and econometrics in the context of linear regression and time series analysis. These are also known as heteroskedasticity-robust standard errors (or simply robust standard errors), Eicker–Huber–White standard errors (also Huber–White standard errors or White standard errors), to recognize the contributions of Friedhelm Eicker, Peter J. Huber, and Halbert White.
In regression and time-series modelling, basic forms of models make use of the assumption that the errors or disturbances ''u''<sub>''i''</sub> have the same variance across all observation points. When this is not the case, the errors are said to be heteroskedastic, or to have heteroskedasticity, and this behaviour will be reflected in the residuals estimated from a fitted model. Heteroskedasticity-consistent standard errors are used to allow the fitting of a model that does contain heteroskedastic residuals. The first such approach was proposed by Huber (1967), and further improved procedures have been produced since for cross-sectional data, time-series data and GARCH estimation.
Heteroskedasticity-consistent standard errors that differ from classical standard errors may indicate model misspecification. Substituting heteroskedasticity-consistent standard errors does not resolve this misspecification, which may lead to bias in the coefficients. In most situations, the problem should be found and fixed. Other types of standard error adjustments, such as
clustered standard errors or
HAC standard errors, may be considered as extensions to HC standard errors.
History
Heteroskedasticity-consistent standard errors were introduced by Friedhelm Eicker, and popularized in econometrics by Halbert White.
Problem
Consider the linear regression model for the scalar ''y'':

:y_i = x_i^\top \beta + u_i,

where x_i is a ''k'' × 1 column vector of explanatory variables (features), \beta is a ''k'' × 1 column vector of parameters to be estimated, and u_i is the residual error.
The ordinary least squares (OLS) estimator is

:\widehat{\beta}_\text{OLS} = (X^\top X)^{-1} X^\top y,

where y is an ''n'' × 1 vector of observations y_i, and X denotes the ''n'' × ''k'' matrix of stacked x_i^\top values observed in the data.
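As an illustration, the closed-form OLS estimator above can be computed directly with NumPy. This is a minimal sketch using simulated data; the sample size, true parameter values, and variable names are chosen for the example only:

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 200, 3

X = rng.normal(size=(n, k))        # n x k design matrix of stacked x_i'
beta = np.array([1.0, -2.0, 0.5])  # true parameters (illustrative values)
u = rng.normal(size=n)             # homoskedastic error terms
y = X @ beta + u

# OLS estimator: (X'X)^{-1} X'y, computed by solving the normal equations
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(beta_hat)
```

In practice a least-squares solver such as `np.linalg.lstsq` is numerically preferable to forming X'X explicitly, but solving the normal equations mirrors the formula above.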
If the sample errors have equal variance \sigma^2 and are uncorrelated, then the least-squares estimate of \beta is BLUE (best linear unbiased estimator), and its variance is estimated with

:\widehat{V}[\widehat{\beta}_\text{OLS}] = s^2 (X^\top X)^{-1}, \qquad s^2 = \frac{1}{n-k} \sum_i \widehat{u}_i^2,

where \widehat{u}_i = y_i - x_i^\top \widehat{\beta}_\text{OLS} are the regression residuals.
When the error terms do not have constant variance (i.e., the assumption of \operatorname{E}[u_i^2 \mid x_i] = \sigma^2 is untrue), the OLS estimator loses its desirable properties. The formula for variance now cannot be simplified:

:V[\widehat{\beta}_\text{OLS}] = (X^\top X)^{-1} X^\top \Sigma X (X^\top X)^{-1},

where \Sigma = V[u \mid X]. While the OLS point estimator remains unbiased, it is not "best" in the sense of having minimum mean square error, and the OLS variance estimator s^2 (X^\top X)^{-1} does not provide a consistent estimate of the true variance of \widehat{\beta}_\text{OLS}.
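The sandwich form above suggests White's HC0 estimator, which replaces the unknown \Sigma with a diagonal matrix of squared residuals, \operatorname{diag}(\widehat{u}_1^2, \ldots, \widehat{u}_n^2). A self-contained NumPy sketch, with simulated heteroskedastic data whose error variance grows with the regressor (all names and values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 500, 2
X = np.column_stack([np.ones(n), rng.uniform(0, 5, size=n)])  # intercept + one regressor
u = rng.normal(size=n) * X[:, 1]   # heteroskedastic: Var(u_i) proportional to x_i^2
y = X @ np.array([2.0, 1.5]) + u

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
resid = y - X @ beta_hat

# Classical (homoskedasticity-assuming) variance estimate: s^2 (X'X)^{-1}
s2 = resid @ resid / (n - k)
V_classical = s2 * XtX_inv

# White HC0 sandwich: (X'X)^{-1} [sum_i u_hat_i^2 x_i x_i'] (X'X)^{-1}
meat = (X * resid[:, None] ** 2).T @ X
V_hc0 = XtX_inv @ meat @ XtX_inv

se_classical = np.sqrt(np.diag(V_classical))
se_hc0 = np.sqrt(np.diag(V_hc0))
print(se_classical, se_hc0)
```

Library implementations (for example, statsmodels with `cov_type="HC0"`, or higher-order variants HC1 through HC3 that apply small-sample corrections) compute the same sandwich with additional refinements.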