Set Identifiable

In statistics and econometrics, set identification (or partial identification) extends the concept of identifiability (or "point identification") in statistical models to situations where the distribution of observable variables is not informative of the exact value of a parameter, but instead constrains the parameter to lie in a strict subset of the parameter space. Statistical models that are set identified arise in a variety of settings in economics, including game theory and the Rubin causal model. Though the use of set identification dates to a 1934 article by Ragnar Frisch, the methods were significantly developed and promoted by Charles Manski starting in the 1990s. Manski developed a method of worst-case bounds for accounting for selection bias. Unlike methods that make additional statistical assumptions, such as Heckman correction, the worst-case bounds rely only on the data to generate a range of supported parameter values.


Definition

Let \mathcal{P} = \{P_\theta : \theta \in \Theta\} be a statistical model where the parameter space \Theta is either finite- or infinite-dimensional. Suppose \theta_0 is the true parameter value. We say that \theta_0 is set identified if there exists \theta \in \Theta such that P_\theta \neq P_{\theta_0}; that is, some parameter values in \Theta are not observationally equivalent to \theta_0. In that case, the identified set is the set of parameter values that are observationally equivalent to \theta_0.
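
The definition can be restated compactly in symbols. The following is a sketch only; the label \Theta_I for the identified set is assumed here to match the notation used in the missing-data example, and the comment on point identification is the standard special case rather than something stated in the definition itself.

```latex
\Theta_I = \{\theta \in \Theta : P_\theta = P_{\theta_0}\}, \qquad \theta_0 \in \Theta_I .
% Point identification is the special case \Theta_I = \{\theta_0\}.
% Under the definition above, set identification requires some \theta with
% P_\theta \neq P_{\theta_0}, that is, \Theta_I \subsetneq \Theta.
```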


Example: missing data

This example is due to . Suppose there are two binary random variables, Y and Z. The econometrician is interested in \mathrm P(Y = 1). There is a missing data problem, however: Y can only be observed if Z = 1. By the law of total probability,
:\mathrm P(Y = 1) = \mathrm P(Y = 1 \mid Z = 1) \mathrm P(Z = 1) + \mathrm P(Y = 1 \mid Z = 0) \mathrm P(Z = 0).
The only unknown object is \mathrm P(Y = 1 \mid Z = 0), which is constrained to lie between 0 and 1. Therefore, the identified set is
:\Theta_I = \{ p \in [0, 1] : p = \mathrm P(Y = 1 \mid Z = 1) \mathrm P(Z = 1) + q \mathrm P(Z = 0), \text{ for some } q \in [0, 1] \}.
Given the missing data constraint, the econometrician can only say that \mathrm P(Y = 1) \in \Theta_I. This makes use of all available information.
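
The bounds in this example can be computed directly. The sketch below assumes the two identified quantities, \mathrm P(Y = 1 \mid Z = 1) and \mathrm P(Z = 1), are known exactly; the function name `worst_case_bounds` and the numbers in the usage line are hypothetical and chosen only for illustration.

```python
from typing import Tuple


def worst_case_bounds(p_y1_given_z1: float, p_z1: float) -> Tuple[float, float]:
    """Return (lower, upper) bounds on P(Y = 1) when Y is observed only if Z = 1."""
    for name, value in (("p_y1_given_z1", p_y1_given_z1), ("p_z1", p_z1)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")
    # P(Y = 1 | Z = 1) * P(Z = 1) is pinned down by the observable data.
    identified_part = p_y1_given_z1 * p_z1
    p_z0 = 1.0 - p_z1
    # q = P(Y = 1 | Z = 0) is unknown: q = 0 gives the lower bound, q = 1 the upper.
    return identified_part, identified_part + p_z0


# Hypothetical inputs: 80% of outcomes observed, 60% of those equal to 1.
lower, upper = worst_case_bounds(p_y1_given_z1=0.6, p_z1=0.8)
print(f"P(Y = 1) lies in [{lower:.2f}, {upper:.2f}]")
```

The width of the interval equals \mathrm P(Z = 0), so the bounds collapse to a single point only when no data are missing.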


Statistical inference

Set estimation cannot rely on the usual tools for statistical inference developed for point estimation. A literature in statistics and econometrics studies methods for statistical inference in the context of set-identified models, focusing on constructing confidence intervals or confidence regions with appropriate properties. For example, a method developed by (and which describes as complicated) constructs confidence regions that cover the identified set with a given probability.
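
To make the idea of covering the identified set concrete, the sketch below builds a deliberately simple, conservative interval for the missing-data example. It is not the method referenced above: it is a Bonferroni-style construction that widens each estimated bound by a one-sided normal critical value, assuming an i.i.d. sample and a large-sample normal approximation. The function name `conservative_region` and the sample are hypothetical.

```python
from math import sqrt
from statistics import NormalDist
from typing import Optional, Sequence, Tuple


def conservative_region(
    y: Sequence[Optional[int]], z: Sequence[int], alpha: float = 0.05
) -> Tuple[float, float]:
    """Interval intended to cover the whole identified set for P(Y = 1)
    with asymptotic probability at least 1 - alpha (Bonferroni bound)."""
    n = len(z)
    if n == 0 or len(y) != n:
        raise ValueError("y and z must be non-empty and of equal length")
    # W_L = 1{Z = 1, Y = 1}: its mean estimates the lower bound.
    w_lower = [1 if (zi == 1 and yi == 1) else 0 for yi, zi in zip(y, z)]
    # W_U = W_L + 1{Z = 0}: its mean estimates the upper bound.
    w_upper = [wl + (1 - zi) for wl, zi in zip(w_lower, z)]
    lower_hat = sum(w_lower) / n
    upper_hat = sum(w_upper) / n
    # Both estimates are means of 0/1 variables, so binomial standard errors apply.
    se_lower = sqrt(lower_hat * (1 - lower_hat) / n)
    se_upper = sqrt(upper_hat * (1 - upper_hat) / n)
    # Spend alpha / 2 on each endpoint (one-sided), as in a Bonferroni correction.
    crit = NormalDist().inv_cdf(1 - alpha / 2)
    return max(0.0, lower_hat - crit * se_lower), min(1.0, upper_hat + crit * se_upper)


# Hypothetical sample: Y is recorded as None whenever Z = 0.
z_obs = [1] * 80 + [0] * 20
y_obs = [1] * 48 + [0] * 32 + [None] * 20
print(conservative_region(y_obs, z_obs))
```

Because the error probability is split across the two endpoints, the interval is generally wider than necessary; the normal approximation can also be poor when an estimated bound is close to 0 or 1.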




Further reading

* Manski, Charles F. (2003). Partial Identification of Probability Distributions. New York: Springer-Verlag. ISBN 978-0-387-00454-9.