In
statistics, ordered probit is a generalization of the widely used
probit
In probability theory and statistics, the probit function is the quantile function associated with the standard normal distribution. It has applications in data analysis and machine learning, in particular exploratory statistical graphics and s ...
analysis to the case of more than two outcomes of an
ordinal dependent variable
Dependent and independent variables are variables in mathematical modeling, statistical modeling and experimental sciences. Dependent variables receive this name because, in an experiment, their values are studied under the supposition or demand ...
(a dependent variable for which the potential values have a natural ordering, as in poor, fair, good, excellent). Similarly, the widely used
logit method also has a counterpart
ordered logit. Ordered probit, like ordered logit, is a particular method of
ordinal regression.
For example, in
clinical research
Clinical research is a branch of healthcare science that determines the safety and effectiveness (efficacy) of medications, devices, diagnostic products and treatment regimens intended for human use. These may be used for prevention, treatmen ...
, the effect a drug may have on a patient may be modeled with ordered probit regression. Independent variables may include the use or non-use of the drug as well as control variables such as age and details from medical history such as whether the patient suffers from high
blood pressure, heart disease, etc. The dependent variable would be ranked from the following list: complete cure, relieve symptoms, no effect, deteriorate condition, death.
Another example application are
Likert-type items commonly employed in survey research, where respondents rate their agreement on an ordered scale (e.g., "Strongly disagree" to "Strongly agree"). The ordered probit model provides an appropriate fit to these data, preserving the ordering of response options while making no assumptions of the interval distances between options.
Conceptual underpinnings
Suppose the underlying relationship to be characterized is
:
,
where
is the exact but unobserved dependent variable (perhaps the exact level of improvement by the patient);
is the vector of independent variables, and
is the vector of regression coefficients which we wish to estimate. Further suppose that while we cannot observe
, we instead can only observe the categories of response:
: