Censored regression models are a class of
models
A model is an informative representation of an object, person, or system. The term originally denoted the plans of a building in late 16th-century English, and derived via French and Italian ultimately from Latin , .
Models can be divided int ...
in which the
dependent variable
A variable is considered dependent if it depends on (or is hypothesized to depend on) an independent variable. Dependent variables are studied under the supposition or demand that they depend, by some law or rule (e.g., by a mathematical functio ...
is
censored above or below a certain threshold. A commonly used
likelihood-based model to accommodate to a censored sample is the
Tobit model, but
quantile
In statistics and probability, quantiles are cut points dividing the range of a probability distribution into continuous intervals with equal probabilities or dividing the observations in a sample in the same way. There is one fewer quantile t ...
and
nonparametric estimators have also been developed.
These and other censored regression models are often confused with
truncated regression models. Truncated regression models are used for data where whole observations are missing so that the values for the dependent and the independent variables are unknown. Censored regression models are used for data where only the value for the dependent variable is unknown while the values of the independent variables are still available.
Censored dependent variables frequently arise in
econometrics
Econometrics is an application of statistical methods to economic data in order to give empirical content to economic relationships. M. Hashem Pesaran (1987). "Econometrics", '' The New Palgrave: A Dictionary of Economics'', v. 2, p. 8 p. 8 ...
. A common example is
labor supply
In Mainstream economics, mainstream economic theories, the labour supply is the total hours (adjusted for intensity of effort) that workers wish to work at a given real wage rate. It is frequently represented graphically by a labour supply curve, ...
. Data are frequently available on the hours worked by employees, and a labor supply model estimates the relationship between hours worked and characteristics of employees such as age, education and family status. However, such estimates undertaken using
linear regression
In statistics, linear regression is a statistical model, model that estimates the relationship between a Scalar (mathematics), scalar response (dependent variable) and one or more explanatory variables (regressor or independent variable). A mode ...
will be biased by the fact that for people who are unemployed it is not possible to observe the number of hours they would have worked had they had employment. Still we know age, education and family status for those observations.
See also
*
Nonlinear regression
In statistics, nonlinear regression is a form of regression analysis in which observational data are modeled by a function which is a nonlinear combination of the model parameters and depends on one or more independent variables. The data are fi ...
*
Nonparametric regression
Nonparametric regression is a form of regression analysis where the predictor does not take a predetermined form but is completely constructed using information derived from the data. That is, no parametric equation is assumed for the relationshi ...
*
Sampling bias
In statistics, sampling bias is a bias (statistics), bias in which a sample is collected in such a way that some members of the intended statistical population, population have a lower or higher sampling probability than others. It results in a b ...
*
Truncated normal hurdle model
References
Further reading
*
*
Regression models
Single-equation methods (econometrics)
Mathematical and quantitative methods (economics)
{{econometrics-stub