In statistics, a tobit model is any of a class of regression models in which the observed range of the dependent variable is censored in some way. The term was coined by Arthur Goldberger in reference to James Tobin, who developed the model in 1958 to mitigate the problem of zero-inflated data for observations of household expenditure on durable goods. Because Tobin's method can be easily extended to handle truncated and other non-randomly selected samples, some authors adopt a broader definition of the tobit model that includes these cases.
Tobin's idea was to modify the likelihood function so that it reflects the unequal sampling probability for each observation depending on whether the latent dependent variable fell above or below the determined threshold. For a sample that, as in Tobin's original case, was censored from below at zero, the sampling probability for each non-limit observation is simply the height of the appropriate density function. For any limit observation, it is the cumulative distribution, i.e. the integral below zero of the appropriate density function. The tobit likelihood function is thus a mixture of densities and cumulative distribution functions.
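The mixture of densities and cumulative distribution values can be illustrated with a short Python sketch. It assumes normally distributed errors and censoring from below at zero; the function name `likelihood_contribution`, its parameters, and the sample values are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def likelihood_contribution(y, mean, sigma):
    """Likelihood contribution of one observation censored from below at zero.

    `mean` is the conditional mean of the latent variable and `sigma` its
    standard deviation (both are assumed inputs for this illustration).
    """
    if y > 0:
        # Non-limit observation: height of the normal density at y.
        return STD_NORMAL.pdf((y - mean) / sigma) / sigma
    # Limit observation: probability that the latent variable is <= 0,
    # i.e. the integral of the density below zero.
    return STD_NORMAL.cdf((0.0 - mean) / sigma)


# The sample likelihood multiplies density heights (non-limit observations)
# with cumulative probabilities (limit observations). Made-up data.
sample = [0.0, 1.2, 0.0, 2.5]
likelihood = 1.0
for y in sample:
    likelihood *= likelihood_contribution(y, mean=1.0, sigma=1.0)
```

For long samples the product underflows quickly, which is one reason the logarithm of the likelihood is usually the quantity worked with in practice.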
The likelihood function
Below are the likelihood and log-likelihood functions for a type I tobit. This is a tobit that is censored from below at <math>y_L</math> when the latent variable <math>y_j^* \leq y_L</math>. In writing out the likelihood function, we first define an indicator function <math>I</math>:
:<math>I(y) = \begin{cases} 0 & \text{if } y \leq y_L, \\ 1 & \text{if } y > y_L. \end{cases}</math>
Next, let <math>\Phi</math> be the standard normal cumulative distribution function and <math>\varphi</math> be the standard normal probability density function. For a data set with ''N'' observations, where <math>X_j</math> denotes the regressors of observation ''j'', the likelihood function for a type I tobit is
:<math>\mathcal{L}(\beta, \sigma) = \prod_{j=1}^N \left( \frac{1}{\sigma} \varphi\left( \frac{y_j - X_j \beta}{\sigma} \right) \right)^{I(y_j)} \left( 1 - \Phi\left( \frac{X_j \beta - y_L}{\sigma} \right) \right)^{1 - I(y_j)}</math>
and the log-likelihood is given by
:<math>\log \mathcal{L}(\beta, \sigma) = \sum_{j=1}^N \left[ I(y_j) \log\left( \frac{1}{\sigma} \varphi\left( \frac{y_j - X_j \beta}{\sigma} \right) \right) + \left( 1 - I(y_j) \right) \log\left( 1 - \Phi\left( \frac{X_j \beta - y_L}{\sigma} \right) \right) \right]</math>
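As a rough illustration, the type I tobit log-likelihood can be evaluated in a few lines of Python. This is a minimal sketch assuming normal errors and a known lower limit; the function name `tobit_log_likelihood`, the argument layout (lists of regressor rows), and the data are hypothetical and not taken from any particular library.

```python
from math import log
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution


def tobit_log_likelihood(y, x, beta, sigma, y_lower=0.0):
    """Log-likelihood of a type I tobit censored from below at `y_lower`.

    `y` is a list of observed outcomes, `x` a list of regressor rows,
    `beta` a list of coefficients and `sigma` the latent standard deviation.
    """
    total = 0.0
    for y_j, x_j in zip(y, x):
        xb = sum(x_jk * b_k for x_jk, b_k in zip(x_j, beta))
        if y_j > y_lower:
            # Uncensored: log of (1/sigma) * phi((y_j - x_j beta) / sigma).
            total += log(STD_NORMAL.pdf((y_j - xb) / sigma) / sigma)
        else:
            # Censored: log of Phi((y_lower - x_j beta) / sigma), which by
            # symmetry equals log(1 - Phi((x_j beta - y_lower) / sigma)).
            total += log(STD_NORMAL.cdf((y_lower - xb) / sigma))
    return total


# Made-up data: an intercept column and one regressor.
y = [0.0, 0.0, 1.3, 2.1, 0.4]
x = [[1.0, -1.0], [1.0, -0.5], [1.0, 0.8], [1.0, 1.5], [1.0, 0.1]]
value = tobit_log_likelihood(y, x, beta=[0.5, 1.0], sigma=1.0)
```

A maximum likelihood estimate would be obtained by passing the negative of this function to a numerical optimizer; that step is omitted here.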
Reparametrization
The log-likelihood as stated above is not globally concave, which complicates the maximum likelihood estimation. Olsen suggested the simple reparametrization <math>\beta = \delta/\gamma</math> and <math>\sigma^2 = \gamma^{-2}</math>, resulting in a transformed log-likelihood,
:<math>\log \mathcal{L}(\delta, \gamma) = \sum_{y_j > y_L} \left\{ \log \gamma + \log\left[ \varphi\left( \gamma y_j - X_j \delta \right) \right] \right\} + \sum_{y_j = y_L} \log\left[ \Phi\left( \gamma y_L - X_j \delta \right) \right]</math>
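The reparametrization can be sketched in Python as follows, taking <math>\gamma = 1/\sigma</math> and <math>\delta = \beta/\sigma</math>. The helper names `to_olsen` and `olsen_log_likelihood` and the data are hypothetical; the sketch assumes normal errors and censoring from below at a known limit.

```python
from math import log
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution


def to_olsen(beta, sigma):
    """Map (beta, sigma) to (delta, gamma), gamma = 1/sigma, delta = beta/sigma."""
    gamma = 1.0 / sigma
    return [b_k * gamma for b_k in beta], gamma


def olsen_log_likelihood(y, x, delta, gamma, y_lower=0.0):
    """Tobit log-likelihood written in terms of (delta, gamma)."""
    total = 0.0
    for y_j, x_j in zip(y, x):
        xd = sum(x_jk * d_k for x_jk, d_k in zip(x_j, delta))
        if y_j > y_lower:
            # Uncensored: log(gamma) + log(phi(gamma * y_j - x_j delta)).
            total += log(gamma) + log(STD_NORMAL.pdf(gamma * y_j - xd))
        else:
            # Censored: log(Phi(gamma * y_lower - x_j delta)).
            total += log(STD_NORMAL.cdf(gamma * y_lower - xd))
    return total


# Made-up data: an intercept column and one regressor.
y = [0.0, 0.0, 1.3, 2.1, 0.4]
x = [[1.0, -1.0], [1.0, -0.5], [1.0, 0.8], [1.0, 1.5], [1.0, 0.1]]
delta, gamma = to_olsen(beta=[0.5, 1.0], sigma=2.0)
value = olsen_log_likelihood(y, x, delta, gamma)
```

Evaluated at the mapped parameters, this returns the same value as the log-likelihood in the original parametrization; only the shape of the surface seen by the optimizer changes. Estimates of the original parameters can be recovered afterwards as `beta = delta / gamma` and `sigma = 1 / gamma`.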