Tobit Model

In statistics, a tobit model is any of a class of regression models in which the observed range of the dependent variable is censored in some way. The term was coined by Arthur Goldberger in reference to James Tobin, who developed the model in 1958 to mitigate the problem of zero-inflated data for observations of household expenditure on durable goods. Because Tobin's method can be easily extended to handle truncated and other non-randomly selected samples, some authors adopt a broader definition of the tobit model that includes these cases.

Tobin's idea was to modify the likelihood function so that it reflects the unequal sampling probability for each observation depending on whether the latent dependent variable fell above or below the determined threshold. For a sample that, as in Tobin's original case, was censored from below at zero, the sampling probability for each non-limit observation is simply the height of the appropriate density function. For any limit observation, it is the cumulative distribution, i.e. the integral below zero of the appropriate density function. The tobit likelihood function is thus a mixture of densities and cumulative distribution functions.


The likelihood function

Below are the likelihood and log likelihood functions for a type I tobit. This is a tobit that is censored from below at y_L when the latent variable y_j^* \leq y_L . In writing out the likelihood function, we first define an indicator function I :

: I(y) = \begin{cases} 0 & \text{if } y \leq y_L, \\ 1 & \text{if } y > y_L. \end{cases}

Next, let \Phi be the standard normal cumulative distribution function and \varphi be the standard normal probability density function. For a data set with ''N'' observations the likelihood function for a type I tobit is

: \mathcal{L}(\beta, \sigma) = \prod_{j=1}^N \left(\frac{1}{\sigma}\varphi\left(\frac{y_j - X_j\beta}{\sigma}\right)\right)^{I(y_j)} \left(1-\Phi\left(\frac{X_j\beta - y_L}{\sigma}\right)\right)^{1-I(y_j)}

and the log likelihood is given by

: \begin{align} \log \mathcal{L}(\beta, \sigma) &= \sum_{j=1}^N I(y_j) \log\left( \frac{1}{\sigma} \varphi\left( \frac{y_j - X_j\beta}{\sigma} \right) \right) + (1 - I(y_j)) \log\left( 1 - \Phi\left( \frac{X_j\beta - y_L}{\sigma} \right) \right) \\ &= \sum_{y_j > y_L} \log\left( \frac{1}{\sigma} \varphi\left( \frac{y_j - X_j\beta}{\sigma} \right) \right) + \sum_{y_j = y_L} \log\left( \Phi\left( \frac{y_L - X_j\beta}{\sigma} \right) \right) \end{align}
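The log likelihood above translates fairly directly into code. The following is a minimal sketch, not a reference implementation: the function name tobit_loglik, the log(sigma) parametrization, the simulated data and the choice of optimizer are all assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm


def tobit_loglik(params, X, y, y_L=0.0):
    """Type I tobit log likelihood, censored from below at y_L.

    params holds beta followed by log(sigma); the log keeps sigma positive.
    """
    beta, sigma = params[:-1], np.exp(params[-1])
    xb = X @ beta
    uncensored = y > y_L  # indicator I(y_j) = 1
    # Non-limit observations contribute log((1/sigma) * phi((y_j - X_j beta)/sigma)).
    ll_unc = norm.logpdf(y[uncensored], loc=xb[uncensored], scale=sigma)
    # Limit observations contribute log(Phi((y_L - X_j beta)/sigma)).
    ll_cen = norm.logcdf((y_L - xb[~uncensored]) / sigma)
    return ll_unc.sum() + ll_cen.sum()


# Simulated data; every numeric value here is an illustrative assumption.
rng = np.random.default_rng(0)
n = 2000
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true, sigma_true = np.array([0.5, 1.0]), 1.0
y_star = X @ beta_true + sigma_true * rng.normal(size=n)  # latent variable
y = np.maximum(y_star, 0.0)  # observed variable, censored from below at zero

# Maximize the log likelihood (scaled by n for numerical convenience).
start = np.zeros(X.shape[1] + 1)
res = minimize(lambda p: -tobit_loglik(p, X, y) / n, start, method="BFGS")
beta_hat, sigma_hat = res.x[:-1], np.exp(res.x[-1])
print(beta_hat, sigma_hat)
```

Working with log(sigma) is only one way of keeping the scale parameter positive during optimization.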


Reparametrization

The log-likelihood as stated above is not globally concave, which complicates the maximum likelihood estimation. Olsen suggested the simple reparametrization \beta = \delta/\gamma and \sigma^2 = \gamma^{-2}, resulting in a transformed log-likelihood,

: \log \mathcal{L}(\delta, \gamma) = \sum_{y_j > y_L} \left\{ \log \gamma + \log\left[ \varphi\left( \gamma y_j - X_j \delta \right) \right] \right\} + \sum_{y_j = y_L} \log\left[ \Phi\left( \gamma y_L - X_j \delta \right) \right]

which is globally concave in terms of the transformed parameters. For the truncated (tobit II) model, Orme showed that while the log-likelihood is not globally concave, it is concave at any stationary point under the above transformation.
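As an illustration, the transformed log-likelihood can be maximized directly in (\delta, \gamma) and the estimates mapped back to (\beta, \sigma). This is a sketch under assumed settings: the function name olsen_loglik, the simulated data, the bound on \gamma and the optimizer are illustrative choices, not part of Olsen's proposal.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm


def olsen_loglik(theta, X, y, y_L=0.0):
    """Tobit log-likelihood in Olsen's parameters.

    theta holds delta = beta / sigma followed by gamma = 1 / sigma.
    """
    delta, gamma = theta[:-1], theta[-1]
    xd = X @ delta
    unc = y > y_L
    # log(gamma) + log(phi(gamma * y_j - X_j delta)) for non-limit observations.
    ll_unc = np.log(gamma) + norm.logpdf(gamma * y[unc] - xd[unc])
    # log(Phi(gamma * y_L - X_j delta)) for limit observations.
    ll_cen = norm.logcdf(gamma * y_L - xd[~unc])
    return ll_unc.sum() + ll_cen.sum()


# Simulated data; every numeric value here is an illustrative assumption.
rng = np.random.default_rng(0)
n = 2000
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true, sigma_true = np.array([0.5, 1.0]), 2.0
y = np.maximum(X @ beta_true + sigma_true * rng.normal(size=n), 0.0)

k = X.shape[1]
start = np.append(np.zeros(k), 1.0)
bounds = [(None, None)] * k + [(1e-6, None)]  # keep gamma strictly positive
res = minimize(lambda t: -olsen_loglik(t, X, y) / n, start,
               method="L-BFGS-B", bounds=bounds)

# Map the transformed estimates back to the original parameters.
delta_hat, gamma_hat = res.x[:-1], res.x[-1]
beta_hat, sigma_hat = delta_hat / gamma_hat, 1.0 / gamma_hat
print(beta_hat, sigma_hat)
```

Because the transformation is one-to-one, the maximum found in (\delta, \gamma) corresponds to the maximum likelihood estimate of (\beta, \sigma).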


Consistency

If the relationship parameter \beta is estimated by regressing the observed y_i on x_i , the resulting ordinary least squares regression estimator is inconsistent. It will yield a downwards-biased estimate of the slope coefficient and an upward-biased estimate of the intercept. Takeshi Amemiya (1973) has proven that the maximum likelihood estimator suggested by Tobin for this model is consistent.
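A small simulation can make the direction of the least squares bias visible. The sketch below assumes a single normally distributed regressor, standard normal errors and censoring from below at zero; the parameter values are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 5000
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
beta_true = np.array([0.5, 1.0])  # illustrative intercept and slope
y_star = X @ beta_true + rng.normal(size=n)  # latent outcome
y = np.maximum(y_star, 0.0)                  # observed, censored at zero

# OLS on the latent outcome (infeasible in practice) versus the censored one.
ols_latent, *_ = np.linalg.lstsq(X, y_star, rcond=None)
ols_censored, *_ = np.linalg.lstsq(X, y, rcond=None)
print("latent  :", ols_latent)
print("censored:", ols_censored)
```

In this design the slope estimated from the censored data falls below the true slope and the intercept rises above the true intercept, in line with the statement above.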


Interpretation

The \beta coefficient should not be interpreted as the effect of x_i on y_i, as one would with a linear regression model; this is a common error. Instead, \beta measures the effect of x_i on the latent variable y_i^*, while the effect on the observed y_i is the combination of (1) the change in y_i of those above the limit, weighted by the probability of being above the limit; and (2) the change in the probability of being above the limit, weighted by the expected value of y_i if above.
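For the standard case of censoring from below at zero with normal errors, the two components can be written out explicitly. The following is a sketch under those assumptions, treating x_i as a scalar; the shorthand z_i = x_i\beta/\sigma and \lambda(z) = \varphi(z)/\Phi(z) is introduced here for convenience and is not notation used elsewhere in this article.

```latex
\begin{align}
E[y_i \mid x_i] &= \Phi(z_i)\, E[y_i \mid x_i, y_i > 0],
\qquad E[y_i \mid x_i, y_i > 0] = x_i\beta + \sigma\lambda(z_i), \\
\frac{\partial E[y_i \mid x_i]}{\partial x_i}
&= \underbrace{\Phi(z_i)\,\frac{\partial E[y_i \mid x_i, y_i > 0]}{\partial x_i}}_{(1)}
 + \underbrace{E[y_i \mid x_i, y_i > 0]\,\frac{\partial \Phi(z_i)}{\partial x_i}}_{(2)} \\
&= \Phi(z_i)\,\beta\left[1 - z_i\lambda(z_i) - \lambda(z_i)^2\right]
 + \left(x_i\beta + \sigma\lambda(z_i)\right)\frac{\beta}{\sigma}\,\varphi(z_i)
 = \beta\,\Phi(z_i).
\end{align}
```

Under these assumptions the effect of x_i on the observed mean is \beta scaled by the probability of being above the limit, so it is smaller in magnitude than \beta itself.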


Variations of the tobit model

Variations of the tobit model can be produced by changing where and when censoring occurs. Amemiya (1985) classifies these variations into five categories (tobit type I – tobit type V), where tobit type I stands for the first model described above. Schnedler (2005) provides a general formula to obtain consistent likelihood estimators for these and other variations of the tobit model.


Type I

The tobit model is a special case of a censored regression model, because the latent variable y_i^* cannot always be observed while the independent variable x_i is observable. A common variation of the tobit model is censoring at a value y_L different from zero:

: y_i = \begin{cases} y_i^* & \text{if } y_i^* > y_L, \\ y_L & \text{if } y_i^* \leq y_L. \end{cases}

Another example is censoring of values above y_U.

: y_i = \begin{cases} y_i^* & \text{if } y_i^* < y_U, \\ y_U & \text{if } y_i^* \geq y_U. \end{cases}

Yet another model results when y_i is censored from above and below at the same time.

: y_i = \begin{cases} y_i^* & \text{if } y_L < y_i^* < y_U, \\ y_L & \text{if } y_i^* \leq y_L, \\ y_U & \text{if } y_i^* \geq y_U. \end{cases}

The rest of the models will be presented as being bounded from below at 0, though this can be generalized as done for Type I.
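The three censoring rules differ only in which bounds are active, which a short sketch can make concrete. The helper name censor and the sample values are hypothetical.

```python
import numpy as np


def censor(y_star, y_L=-np.inf, y_U=np.inf):
    """Map latent values y* to observed values y under lower and/or upper censoring."""
    return np.minimum(np.maximum(y_star, y_L), y_U)


y_star = np.array([-2.0, -0.5, 0.3, 1.7, 4.0])  # illustrative latent values
print(censor(y_star, y_L=0.0))           # censored from below at y_L = 0
print(censor(y_star, y_U=2.0))           # censored from above at y_U = 2
print(censor(y_star, y_L=0.0, y_U=2.0))  # censored from both sides
```

Leaving a bound at its infinite default switches that side of the censoring off.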


Type II

Type II tobit models introduce a second latent variable.

: y_{2i} = \begin{cases} y_{2i}^* & \text{if } y_{1i}^* > 0, \\ 0 & \text{if } y_{1i}^* \leq 0. \end{cases}

In Type I tobit, the latent variable absorbs both the process of participation and the outcome of interest. Type II tobit allows the process of participation (selection) and the outcome of interest to be independent, conditional on observable data. The Heckman selection model falls into the Type II tobit, which is sometimes called Heckit after James Heckman.
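The structure of a Type II model can be illustrated by simulating the two latent equations separately. In the sketch below, all coefficients and the error correlation rho are assumed values chosen for illustration. The errors of the two equations are allowed to be correlated, which is the situation that selection corrections such as Heckman's are designed for.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20000
x = rng.normal(size=n)
rho = 0.7  # assumed correlation between the two latent errors
u1 = rng.normal(size=n)
u2 = rho * u1 + np.sqrt(1.0 - rho**2) * rng.normal(size=n)

y1_star = 0.5 * x + u1        # latent participation (selection) equation
y2_star = 1.0 + 2.0 * x + u2  # latent outcome equation
selected = y1_star > 0
y2 = np.where(selected, y2_star, 0.0)  # outcome observed only if selected

# OLS of the outcome on x using the selected observations only.
X_sel = np.column_stack([np.ones(selected.sum()), x[selected]])
ols_selected, *_ = np.linalg.lstsq(X_sel, y2[selected], rcond=None)
print("share selected:", selected.mean())
print("OLS on selected sample:", ols_selected)
```

With correlated errors, least squares on the selected observations does not recover the outcome equation's intercept of 1 and slope of 2 in this design; setting rho to zero removes that distortion.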


Type III

Type III introduces a second observed dependent variable.

: y_{1i} = \begin{cases} y_{1i}^* & \text{if } y_{1i}^* > 0, \\ 0 & \text{if } y_{1i}^* \leq 0. \end{cases}

: y_{2i} = \begin{cases} y_{2i}^* & \text{if } y_{1i}^* > 0, \\ 0 & \text{if } y_{1i}^* \leq 0. \end{cases}

The Heckman model falls into this type.


Type IV

Type IV introduces a third observed dependent variable and a third latent variable.

: y_{1i} = \begin{cases} y_{1i}^* & \text{if } y_{1i}^* > 0, \\ 0 & \text{if } y_{1i}^* \leq 0. \end{cases}

: y_{2i} = \begin{cases} y_{2i}^* & \text{if } y_{1i}^* > 0, \\ 0 & \text{if } y_{1i}^* \leq 0. \end{cases}

: y_{3i} = \begin{cases} y_{3i}^* & \text{if } y_{1i}^* \leq 0, \\ 0 & \text{if } y_{1i}^* > 0. \end{cases}


Type V

Similar to Type II, in Type V only the sign of y_{1i}^* is observed.

: y_{2i} = \begin{cases} y_{2i}^* & \text{if } y_{1i}^* > 0, \\ 0 & \text{if } y_{1i}^* \leq 0. \end{cases}

: y_{3i} = \begin{cases} y_{3i}^* & \text{if } y_{1i}^* \leq 0, \\ 0 & \text{if } y_{1i}^* > 0. \end{cases}


Non-parametric version

If the underlying latent variable y_i^* is not normally distributed, one must use quantiles instead of moments to analyze the observable variable y_i. Powell's CLAD estimator offers a possible way to achieve this.
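The reason quantiles are useful here is that censoring commutes with the median but not with the mean. The sketch below illustrates this with heavy-tailed errors and evaluates a least absolute deviations criterion of the kind CLAD minimizes. The function name clad_objective, the error distribution and all parameter values are assumptions, and no attempt is made to implement the full estimator.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 10001  # odd, so the sample median is a single order statistic
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
beta_true = np.array([1.0, 1.0])
# Heavy-tailed, non-normal errors with median zero (Student t, 3 d.o.f.).
y_star = X @ beta_true + rng.standard_t(df=3, size=n)
y = np.maximum(y_star, 0.0)  # censored from below at zero

# The median of the censored variable equals the censored median of y*;
# the same does not hold for the mean.
print(np.median(y), max(0.0, np.median(y_star)))
print(y.mean(), max(0.0, y_star.mean()))


def clad_objective(beta, X, y):
    """Mean absolute deviation of y from the censored index max(0, X beta)."""
    return np.mean(np.abs(y - np.maximum(X @ beta, 0.0)))


# The criterion is smaller at the true coefficients than at a distorted vector.
print(clad_objective(beta_true, X, y), clad_objective(0.5 * beta_true, X, y))
```

Minimizing such a criterion over beta requires a dedicated algorithm because the objective is neither smooth nor convex; the comparison above only shows which of two candidate vectors fits better.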


Applications

Tobit models have, for example, been applied to estimate factors that impact grant receipt, including financial transfers distributed to sub-national governments who may apply for these grants. In these cases, grant recipients cannot receive negative amounts, and the data is thus left-censored. For instance, Dahlberg and Johansson (2002) analyse a sample of 115 municipalities (42 of which received a grant). Dubois and Fattore (2011) use a tobit model to investigate the role of various factors in European Union fund receipt by applying Polish sub-national governments. The data may however be left-censored at a point higher than zero, with the risk of mis-specification. Both studies apply probit and other models to check for robustness.

Tobit models have also been applied in demand analysis to accommodate observations with zero expenditures on some goods. In a related application of tobit models, a system of nonlinear tobit regression models has been used to jointly estimate a brand demand system with homoscedastic, heteroscedastic and generalized heteroscedastic variants.


See also

* Truncated normal hurdle model
* Limited dependent variable
* Rectifier (neural networks)
* Truncated regression model
* Probit model: the name ''tobit'' is a pun on both Tobin, their creator, and their similarities to probit models.

