Unsolved Problems In Statistics

There are many longstanding unsolved problems in mathematics for which a solution has not yet been found. The notable unsolved problems in statistics are generally of a different flavor; according to John Tukey, "difficulties in identifying problems have delayed statistics far more than difficulties in solving problems." A list of "one or two open problems" (in fact 22 of them) was given by David Cox.


Inference and testing

* How to detect and correct for systematic errors, especially in sciences where random errors are large (a situation Tukey termed uncomfortable science).
* The Graybill–Deal estimator is often used to estimate the common mean of two normal populations with unknown and possibly unequal variances. Though this estimator is generally unbiased, its admissibility remains to be shown.
* Meta-analysis: Though independent p-values can be combined using Fisher's method, techniques are still being developed to handle the case of dependent p-values.
* Behrens–Fisher problem: Yuri Linnik showed in 1966 that there is no uniformly most powerful test for the difference of two means when the variances are unknown and possibly unequal. That is, there is no exact test (meaning one that, if the means are in fact equal, rejects the null hypothesis with probability exactly α) that is also the most powerful for all values of the variances (which are thus nuisance parameters). Though there are many approximate solutions (such as Welch's t-test), the problem continues to attract attention as one of the classic problems in statistics.
* Multiple comparisons: There are various ways to adjust p-values to compensate for the simultaneous or sequential testing of hypotheses. Of particular interest is how to simultaneously control the overall error rate, preserve statistical power, and incorporate the dependence between tests into the adjustment. These issues are especially relevant when the number of simultaneous tests can be very large, as is increasingly the case in the analysis of data from DNA microarrays.
* Bayesian statistics: A list of open problems in Bayesian statistics has been proposed.
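
As an illustration of the meta-analysis item above, the independent case that Fisher's method handles can be sketched as follows. This is a minimal sketch, not a reference implementation: the function names are hypothetical, and the code assumes the p-values are independent, which is exactly the assumption that fails in the open problem. It relies on the fact that, under the null hypothesis, -2 times the sum of the log p-values follows a chi-square distribution with 2k degrees of freedom for k p-values, and that the chi-square survival function has a closed form when the degrees of freedom are even.

```python
import math


def chi2_sf_even_df(x, df):
    """Survival function P(X > x) of a chi-square variable with even df.

    For df = 2k the tail has the closed form
    exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term = 1.0   # (x/2)^0 / 0!
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def fisher_combined_pvalue(p_values):
    """Combine independent p-values; returns (statistic, combined p-value)."""
    if not p_values or any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    statistic = -2.0 * sum(math.log(p) for p in p_values)
    return statistic, chi2_sf_even_df(statistic, 2 * len(p_values))
```

With a single p-value the combined result is that p-value unchanged, which is a convenient sanity check. When the p-values are dependent, the chi-square reference distribution no longer applies, and this sketch should not be used as is.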

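Welch's t-test, mentioned above as one approximate solution to the Behrens–Fisher problem, can be sketched in a few lines. The sketch below computes only the test statistic and the Welch–Satterthwaite approximate degrees of freedom; the function name is hypothetical, and turning the statistic into a p-value would additionally need the Student t distribution, which is left out here because it is not part of the Python standard library.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t statistic, approximate degrees of freedom) for two samples.

    Variances are estimated separately for each sample (no pooling),
    which is what distinguishes Welch's test from the classical
    equal-variance two-sample t-test.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each sample needs at least two observations")
    # Squared standard errors of the two sample means.
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    if se2_a + se2_b == 0:
        raise ValueError("both samples have zero variance")
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2_a + se2_b)
    # Welch–Satterthwaite approximation to the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t, df
```

The degrees of freedom are generally not an integer, and the resulting test is approximate rather than exact, consistent with the description of the problem above.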

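To make the multiple-comparisons item more concrete, here is a minimal sketch of two widely used p-value adjustments. Neither is named in the text above; they are chosen only as familiar examples. The Bonferroni adjustment controls the family-wise error rate under any dependence but can cost a great deal of power, while the Benjamini–Hochberg adjustment controls the false discovery rate under independence or certain forms of positive dependence. The function names are hypothetical.

```python
def bonferroni(p_values):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values):
    """Benjamini–Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

Comparing the two outputs on the same input shows the trade-off described above: the Bonferroni values are never smaller than the Benjamini–Hochberg values, so fewer hypotheses are rejected at a given threshold. Neither function uses any information about how the tests depend on one another.
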
Experimental design

* As the theory of Latin squares is a cornerstone in the design of experiments, solving the problems in Latin squares could have immediate applicability to experimental design.
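
For readers unfamiliar with the object in question, a Latin square is an n × n array filled with n different symbols, each occurring exactly once in each row and exactly once in each column. The sketch below builds one by the simple cyclic construction and checks the defining property. It is illustrative only: the function names are hypothetical, the symbols are assumed to be the integers 0 to n-1, and the open problems themselves concern far harder questions than constructing a single square.

```python
def cyclic_latin_square(n):
    """Build an n x n Latin square with entry (row + col) mod n."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return [[(row + col) % n for col in range(n)] for row in range(n)]


def is_latin_square(square):
    """Check that every row and every column contains 0..n-1 exactly once."""
    n = len(square)
    symbols = set(range(n))
    # Each row must have length n and contain every symbol.
    if not all(len(row) == n and set(row) == symbols for row in square):
        return False
    # Each column must contain every symbol.
    return all({square[r][c] for r in range(n)} == symbols for c in range(n))
```

In an experimental design, the rows and columns would typically stand for two blocking factors and the symbols for treatments, so that each treatment appears once at every level of both factors.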


Problems of a more philosophical nature

* Sampling of species problem: How is a probability updated when there is unanticipated new data?
* Doomsday argument: How valid is the probabilistic argument that claims to predict the future lifetime of the human race given only an estimate of the total number of humans born so far?
* Exchange paradox: Issues arise within the subjectivistic interpretation of probability theory; more specifically within Bayesian decision theory. This is still an open problem among the subjectivists as no consensus has been reached yet. Examples include:
** The two envelopes problem
** The necktie paradox
* Sunrise problem: What is the probability that the sun will rise tomorrow? Very different answers arise depending on the methods used and assumptions made.
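
One way to see how the answer to the sunrise problem depends on assumptions is the Bayesian treatment with a Beta prior, of which Laplace's rule of succession is the uniform-prior special case. The sketch below is a minimal illustration under that model only; the function name and the observation counts are hypothetical, and other approaches to the problem are not represented.

```python
from fractions import Fraction


def prob_next_success(successes, trials, prior_a=1, prior_b=1):
    """Posterior predictive probability of a success on the next trial.

    Assumes independent trials with an unknown success probability that
    has a Beta(prior_a, prior_b) prior. The default Beta(1, 1) prior is
    uniform and gives Laplace's rule (successes + 1) / (trials + 2).
    """
    if successes < 0 or trials < successes:
        raise ValueError("need 0 <= successes <= trials")
    if prior_a <= 0 or prior_b <= 0:
        raise ValueError("prior parameters must be positive")
    return Fraction(successes + prior_a, trials + prior_a + prior_b)


# Hypothetical record: the sun rose on each of 10 observed days.
uniform_answer = prob_next_success(10, 10)
jeffreys_answer = prob_next_success(10, 10, Fraction(1, 2), Fraction(1, 2))
```

Here the uniform prior gives 11/12 while the Beta(1/2, 1/2) prior gives 21/22 from the same observations, a small instance of the sensitivity to assumptions noted above.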

