In statistics and econometrics, the multivariate probit model is a generalization of the probit model used to estimate several correlated binary outcomes jointly. For example, if it is believed that the decision to send at least one child to public school and the decision to vote in favor of a school budget are correlated (both decisions are binary), then the multivariate probit model would be appropriate for jointly predicting these two choices on an individual-specific basis. J.R. Ashford and R.R. Sowden initially proposed an approach for multivariate probit analysis. Siddhartha Chib and Edward Greenberg extended this idea and also proposed simulation-based inference methods for the multivariate probit model, which simplified and generalized parameter estimation.


Example: bivariate probit

In the ordinary probit model, there is only one binary dependent variable Y and so only one latent variable Y^* is used. In contrast, in the bivariate probit model there are two binary dependent variables Y_1 and Y_2, so there are two latent variables: Y^*_1 and Y^*_2. It is assumed that each observed variable takes on the value 1 if and only if its underlying continuous latent variable takes on a positive value:

: Y_1 = \begin{cases} 1 & \text{if } Y^*_1 > 0, \\ 0 & \text{otherwise}, \end{cases}

: Y_2 = \begin{cases} 1 & \text{if } Y^*_2 > 0, \\ 0 & \text{otherwise}, \end{cases}

with

: \begin{cases} Y_1^* = X_1\beta_1 + \varepsilon_1, \\ Y_2^* = X_2\beta_2 + \varepsilon_2, \end{cases}

and

: \begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \end{pmatrix} \,\Big|\, X \sim \mathcal{N}\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix} \right).

Fitting the bivariate probit model involves estimating the values of \beta_1, \beta_2, and \rho. To do so, the likelihood of the model has to be maximized. This likelihood is

: \begin{aligned} L(\beta_1,\beta_2,\rho) = \prod \Big( & P(Y_1=1,Y_2=1\mid\beta_1,\beta_2,\rho)^{Y_1 Y_2} \, P(Y_1=0,Y_2=1\mid\beta_1,\beta_2,\rho)^{(1-Y_1)Y_2} \\ & \quad P(Y_1=1,Y_2=0\mid\beta_1,\beta_2,\rho)^{Y_1(1-Y_2)} \, P(Y_1=0,Y_2=0\mid\beta_1,\beta_2,\rho)^{(1-Y_1)(1-Y_2)} \Big). \end{aligned}

Substituting the latent variables Y_1^* and Y_2^* into the probability functions and taking logs gives

: \begin{aligned} \sum \Big( & Y_1 Y_2 \ln P(\varepsilon_1>-X_1\beta_1,\ \varepsilon_2>-X_2\beta_2) \\ & \quad + (1-Y_1)Y_2 \ln P(\varepsilon_1<-X_1\beta_1,\ \varepsilon_2>-X_2\beta_2) \\ & \quad + Y_1(1-Y_2) \ln P(\varepsilon_1>-X_1\beta_1,\ \varepsilon_2<-X_2\beta_2) \\ & \quad + (1-Y_1)(1-Y_2) \ln P(\varepsilon_1<-X_1\beta_1,\ \varepsilon_2<-X_2\beta_2) \Big). \end{aligned}

After some rewriting, the log-likelihood function becomes

: \begin{aligned} \sum \Big( & Y_1 Y_2 \ln \Phi(X_1\beta_1, X_2\beta_2, \rho) \\ & \quad + (1-Y_1)Y_2 \ln \Phi(-X_1\beta_1, X_2\beta_2, -\rho) \\ & \quad + Y_1(1-Y_2) \ln \Phi(X_1\beta_1, -X_2\beta_2, -\rho) \\ & \quad + (1-Y_1)(1-Y_2) \ln \Phi(-X_1\beta_1, -X_2\beta_2, \rho) \Big). \end{aligned}

Note that \Phi here is the cumulative distribution function of the bivariate normal distribution. Y_1 and Y_2 in the log-likelihood function are the observed variables, each equal to one or zero.
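The log-likelihood above can be maximized numerically. The following is a minimal illustrative sketch (not from the source) in Python using SciPy: it simulates data from a bivariate probit with assumed parameter values, builds the log-likelihood from the four cell probabilities via the sign identity P(Y_1=y_1, Y_2=y_2) = \Phi(q_1 X_1\beta_1, q_2 X_2\beta_2, q_1 q_2 \rho) with q_j = 2y_j - 1, and recovers the parameters by maximum likelihood. All variable names and the simulated design are assumptions for the demo.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import multivariate_normal

rng = np.random.default_rng(0)

# Simulate data from a bivariate probit with known (assumed) parameters.
n = 300
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # here X_1 = X_2 = X
beta1_true = np.array([0.5, 1.0])
beta2_true = np.array([-0.3, 0.8])
rho_true = 0.5
eps = rng.multivariate_normal([0, 0], [[1, rho_true], [rho_true, 1]], size=n)
y1 = (X @ beta1_true + eps[:, 0] > 0).astype(float)
y2 = (X @ beta2_true + eps[:, 1] > 0).astype(float)

def neg_loglik(params):
    b1, b2 = params[0:2], params[2:4]
    rho = np.tanh(params[4])          # reparameterize so rho stays in (-1, 1)
    # Sign trick: with q_j = 2*y_j - 1, every cell probability equals
    # Phi_2(q1 * X b1, q2 * X b2, q1 * q2 * rho), so only two correlation
    # values (+rho and -rho) are ever needed.
    q1, q2 = 2 * y1 - 1, 2 * y2 - 1
    a, b, s = q1 * (X @ b1), q2 * (X @ b2), q1 * q2
    p = np.empty(n)
    for sign in (1.0, -1.0):
        m = s == sign
        if not m.any():
            continue
        p[m] = multivariate_normal.cdf(
            np.column_stack([a[m], b[m]]),
            mean=[0.0, 0.0], cov=[[1.0, sign * rho], [sign * rho, 1.0]])
    return -np.sum(np.log(np.clip(p, 1e-300, 1.0)))

res = minimize(neg_loglik, np.zeros(5), method="BFGS", options={"gtol": 1e-4})
beta1_hat, beta2_hat = res.x[0:2], res.x[2:4]
rho_hat = np.tanh(res.x[4])
```

With a few hundred observations the estimates should land near the simulated values, up to sampling error; a production implementation would also report standard errors from the inverse Hessian.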


Multivariate probit

For the general case, \mathbf{y}_i = (y_1, \ldots, y_J), \ i = 1, \ldots, N, where j indexes the choices and i the individuals or observations, the probability of observing the choice vector \mathbf{y}_i is

: \begin{aligned} \Pr(\mathbf{y}_i \mid \mathbf{X}_i\beta, \Sigma) &= \int_{A_J} \cdots \int_{A_1} f_N(\mathbf{y}^*_i \mid \mathbf{X}_i\beta, \Sigma) \, dy^*_1 \cdots dy^*_J \\ &= \int \mathbb{1}_{\mathbf{y}^*_i \in A} \, f_N(\mathbf{y}^*_i \mid \mathbf{X}_i\beta, \Sigma) \, d\mathbf{y}^*_i, \end{aligned}

where A = A_1 \times \cdots \times A_J and

: A_j = \begin{cases} (-\infty, 0] & y_j = 0, \\ (0, \infty) & y_j = 1. \end{cases}

The log-likelihood function in this case is

: \sum_{i=1}^N \log \Pr(\mathbf{y}_i \mid \mathbf{X}_i\beta, \Sigma).

Except for J \leq 2, there is typically no closed-form solution to the integrals in the log-likelihood equation. Instead, simulation methods can be used to simulate the choice probabilities. Methods using importance sampling include the GHK algorithm (Geweke, Hajivassiliou, McFadden and Keane), AR (accept-reject), and Stern's method. There are also MCMC approaches to this problem, including CRB (Chib's method with Rao–Blackwellization), CRT (Chib, Ritter, Tanner), ARK (accept-reject kernel), and ASK (adaptive sampling kernel). A variational approach that scales to large datasets is proposed in Probit-LMM (Mandt, Wenzel, Nakajima et al.).{{cite journal |first1=Stephan |last1=Mandt |first2=Florian |last2=Wenzel |first3=Shinichi |last3=Nakajima |first4=John |last4=Cunningham |first5=Christoph |last5=Lippert |first6=Marius |last6=Kloft |year=2017 |title=Sparse probit linear mixed model |journal=Machine Learning |volume=106 |issue=9–10 |pages=1–22 |url=https://link.springer.com/content/pdf/10.1007%2Fs10994-017-5652-6.pdf |doi=10.1007/s10994-017-5652-6 |arxiv=1507.04777 |s2cid=11588006}}
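To make the simulation idea concrete, here is a minimal sketch (not from the source) of a GHK-style simulator in Python: it estimates \Pr(\mathbf{y}) for one observation by drawing truncated standard normals sequentially along the Cholesky factor of \Sigma and averaging the product of the per-step truncation probabilities. The function name, arguments, and test values are assumptions for the demo, and no numerical safeguards for extreme probabilities are included.

```python
import numpy as np
from scipy.stats import norm

def ghk_probability(y, mu, Sigma, n_draws=1000, rng=None):
    """Illustrative GHK simulator for P(Y = y) in a multivariate probit,
    where Y_j = 1{Y*_j > 0} and Y* ~ N(mu, Sigma)."""
    rng = np.random.default_rng(rng)
    J = len(y)
    L = np.linalg.cholesky(Sigma)          # Y* = mu + L @ eta, eta ~ N(0, I)
    eta = np.zeros((n_draws, J))
    weight = np.ones(n_draws)
    u = rng.uniform(size=(n_draws, J))
    for j in range(J):
        # Threshold t such that y*_j > 0  <=>  eta_j > t, given earlier etas.
        t = -(mu[j] + eta[:, :j] @ L[j, :j]) / L[j, j]
        if y[j] == 1:
            p = 1.0 - norm.cdf(t)          # P(eta_j > t)
            # Inverse-CDF draw from N(0,1) truncated to (t, inf).
            eta[:, j] = norm.ppf(norm.cdf(t) + u[:, j] * p)
        else:
            p = norm.cdf(t)                # P(eta_j <= t)
            eta[:, j] = norm.ppf(u[:, j] * p)
        weight *= p                        # accumulate importance weights
    return weight.mean()

Sigma = np.array([[1.0, 0.5], [0.5, 1.0]])
mu = np.array([0.3, -0.2])
p11 = ghk_probability(np.array([1, 1]), mu, Sigma, n_draws=20000, rng=1)
```

For J = 2 the estimate can be checked against the exact bivariate normal rectangle probability; for larger J this simulator is exactly what makes the otherwise intractable likelihood maximizable.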


References


Further reading

* Greene, William H., ''Econometric Analysis'', seventh edition, Prentice-Hall, 2012.