In statistics, the Behrens–Fisher problem, named after Walter Behrens and Ronald Fisher, is the problem of interval estimation and hypothesis testing concerning the difference between the means of two normally distributed populations when the variances of the two populations are not assumed to be equal, based on two independent samples.
Specification
One difficulty in discussing the Behrens–Fisher problem and its proposed solutions is that there are many different interpretations of what is meant by "the Behrens–Fisher problem". These differences concern not only what counts as a relevant solution, but even the basic statement of the context being considered.
Context
Let ''X''<sub>1</sub>, ..., ''X''<sub>''n''</sub> and ''Y''<sub>1</sub>, ..., ''Y''<sub>''m''</sub> be i.i.d. samples from two populations which both come from the same location–scale family of distributions. The scale parameters are assumed to be unknown and not necessarily equal, and the problem is to assess whether the location parameters can reasonably be treated as equal. Lehmann states that "the Behrens–Fisher problem" is used both for this general form of model, when the family of distributions is arbitrary, and for the case when the restriction to a normal distribution is made. While Lehmann discusses a number of approaches to the more general problem, mainly based on nonparametrics, most other sources appear to use "the Behrens–Fisher problem" to refer only to the case where the distribution is assumed to be normal; most of this article makes this assumption.
Requirements of solutions
Solutions to the Behrens–Fisher problem have been presented that make use of either a classical or a Bayesian inference point of view, and either solution would be notionally invalid judged from the other point of view. If consideration is restricted to classical statistical inference only, it is possible to seek solutions to the inference problem that are simple to apply in a practical sense, giving preference to this simplicity over any inaccuracy in the corresponding probability statements. Where exactness of the significance levels of statistical tests is required, there may be an additional requirement that the procedure should make maximum use of the statistical information in the dataset. It is well known that an exact test can be obtained by randomly discarding data from the larger dataset until the sample sizes are equal, assembling data in pairs and taking differences, and then using an ordinary t-test to test for the mean difference being zero: clearly this would not be "optimal" in any sense.
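The discard-and-pair construction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `paired_discard_t`, the fixed seed, and the example data are hypothetical and do not come from any source. The sketch returns the one-sample ''t'' statistic of the paired differences together with its degrees of freedom; it does not compute a p-value.

```python
import math
import random
import statistics

def paired_discard_t(x, y, seed=0):
    """Return (t, df) for the paired t statistic obtained after randomly
    discarding observations so that both samples have the same size.
    Illustrative helper (hypothetical name); needs at least 2 values per sample."""
    rng = random.Random(seed)
    k = min(len(x), len(y))
    # Keep k randomly chosen observations from each sample. The larger
    # sample loses data; the smaller one is merely put in random order.
    xs = rng.sample(list(x), k)
    ys = rng.sample(list(y), k)
    # Pair the retained observations and take differences.
    d = [a - b for a, b in zip(xs, ys)]
    # Ordinary one-sample t statistic for "mean difference is zero".
    t = statistics.fmean(d) / (statistics.stdev(d) / math.sqrt(k))
    return t, k - 1

# Made-up example data.
x = [5.1, 4.9, 6.2, 5.8, 6.0, 5.5, 5.3, 6.1]
y = [4.2, 4.8, 3.9, 4.4]
t, df = paired_discard_t(x, y)
print(f"t = {t:.3f} on {df} degrees of freedom")
```

Half of the larger sample is thrown away in this example, and a different seed gives a different statistic, which illustrates why the procedure is exact but not "optimal".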
The task of specifying interval estimates for this problem is one where a frequentist approach fails to provide an exact solution, although some approximations are available. Standard Bayesian approaches also fail to provide an answer that can be expressed as a simple formula, but modern computational methods of Bayesian analysis do allow essentially exact solutions to be found. Thus study of the problem can be used to elucidate the differences between the frequentist and Bayesian approaches to interval estimation.
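As a rough illustration of the computational Bayesian route, the following Python sketch simulates the posterior distribution of the difference in means. It assumes independent noninformative priors proportional to 1/''σ''<sub>''i''</sub><sup>2</sup> for each population. That prior choice is an assumption of this sketch, not something fixed by the problem. Under it, each mean has a shifted and scaled Student ''t'' posterior with ''n''<sub>''i''</sub> − 1 degrees of freedom. The function name and the example data are hypothetical.

```python
import math
import random
import statistics

def posterior_diff_interval(x, y, level=0.95, draws=20000, seed=0):
    """Monte Carlo equal-tailed credible interval for mu_x - mu_y.
    ASSUMPTION: independent priors proportional to 1/sigma_i^2, so that
    mu_i | data = mean_i + (s_i / sqrt(n_i)) * t with n_i - 1 d.f."""
    rng = random.Random(seed)

    def summarise(sample):
        n = len(sample)
        scale = statistics.stdev(sample) / math.sqrt(n)
        return statistics.fmean(sample), scale, n - 1

    def draw_mu(mean, scale, df):
        # Student t draw: standard normal divided by sqrt(chi-squared / df).
        chi2 = rng.gammavariate(df / 2.0, 2.0)
        return mean + scale * rng.gauss(0.0, 1.0) / math.sqrt(chi2 / df)

    sx, sy = summarise(x), summarise(y)
    diffs = sorted(draw_mu(*sx) - draw_mu(*sy) for _ in range(draws))
    lo = diffs[int((1 - level) / 2 * draws)]
    hi = diffs[int((1 + level) / 2 * draws) - 1]
    return lo, hi

# Made-up example data.
x = [5.1, 4.9, 6.2, 5.8, 6.0, 5.5, 5.3, 6.1]
y = [4.2, 4.8, 3.9, 4.4]
print(posterior_diff_interval(x, y))
```

The interval is approximate only through Monte Carlo error, which shrinks as `draws` grows; no closed-form quantile is needed.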
Outline of different approaches
Behrens and Fisher approach
Ronald Fisher in 1935 introduced fiducial inference in order to apply it to this problem. He referred to an earlier paper by Walter Ulrich Behrens from 1929. Behrens and Fisher proposed to find the probability distribution of
:<math>T \equiv \frac{\bar x_1 - \bar x_2}{\sqrt{s_1^2/n_1 + s_2^2/n_2}}</math>
where <math>\bar x_1</math> and <math>\bar x_2</math> are the two sample means, ''s''<sub>1</sub> and ''s''<sub>2</sub> are their standard deviations, and ''n''<sub>1</sub> and ''n''<sub>2</sub> are the sample sizes. See Behrens–Fisher distribution. Fisher approximated the distribution of this by ignoring the random variation of the relative sizes of the standard deviations,
:<math>\frac{s_1/\sqrt{n_1}}{\sqrt{s_1^2/n_1 + s_2^2/n_2}}.</math>
Fisher's solution provoked controversy because it did not have the property that the hypothesis of equal means would be rejected with probability α if the means were in fact equal. Many other methods of treating the problem have been proposed since, and the effect on the resulting confidence intervals has been investigated.
Welch's approximate t solution
A widely used method is that of B. L. Welch, who, like Fisher, was at University College London. The variance of the mean difference
:<math>\bar d = \bar x_1 - \bar x_2</math>
is estimated by
:<math>s_{\bar d}^2 = \frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}.</math>
Welch (1938) approximated the distribution of <math>s_{\bar d}^2</math> by the Type III Pearson distribution (a scaled chi-squared distribution) whose first two moments agree with those of <math>s_{\bar d}^2</math>. This applies to the following number of degrees of freedom (d.f.), which is generally non-integer:
:<math>\nu \approx \frac{(\gamma_1 + \gamma_2)^2}{\gamma_1^2/(n_1-1) + \gamma_2^2/(n_2-1)} \quad \text{where } \gamma_i = \sigma_i^2/n_i.</math>
Under the null hypothesis of equal expectations, <math>\mu_1 = \mu_2</math>, the distribution of the Behrens–Fisher statistic ''T'', which also depends on the variance ratio ''σ''<sub>1</sub><sup>2</sup>/''σ''<sub>2</sub><sup>2</sup>, could now be approximated by Student's t distribution with these ''ν'' degrees of freedom. But this ''ν'' contains the population variances ''σ<sub>i</sub>''<sup>2</sup>, and these are unknown. The following estimate only replaces the population variances by the sample variances:
:<math>\hat\nu \approx \frac{(g_1 + g_2)^2}{g_1^2/(n_1-1) + g_2^2/(n_2-1)} \quad \text{where } g_i = s_i^2/n_i.</math>
This <math>\hat\nu</math> is a random variable. A t distribution with a random number of degrees of freedom does not exist. Nevertheless, the Behrens–Fisher ''T'' can be compared with a corresponding quantile of Student's t distribution with these estimated numbers of degrees of freedom, <math>\hat\nu</math>, which is generally non-integer. In this way, the boundary between the acceptance and rejection regions of the test statistic ''T'' is calculated based on the empirical variances ''s<sub>i</sub>''<sup>2</sup>, in a way that is a smooth function of these.
This method also does not give exactly the nominal rate, but is generally not too far off. However, if the population variances are equal, or if the samples are rather small and the population variances can be assumed to be approximately equal, it is more accurate to use Student's t-test.
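A minimal Python sketch of Welch's procedure follows, using only the standard library. It computes the statistic ''T'' as the difference of sample means divided by the square root of ''s''<sub>1</sub><sup>2</sup>/''n''<sub>1</sub> + ''s''<sub>2</sub><sup>2</sup>/''n''<sub>2</sub>, together with the estimated degrees of freedom obtained by plugging the sample variances into Welch's formula. The function name `welch_statistic` and the data are illustrative assumptions; the quantile lookup is left to a statistics package.

```python
import math
import statistics

def welch_statistic(x, y):
    """Return (T, nu_hat): the Behrens-Fisher statistic and Welch's
    estimated (generally non-integer) degrees of freedom.
    Illustrative helper (hypothetical name)."""
    n1, n2 = len(x), len(y)
    # g_i = s_i^2 / n_i, the estimated variance of each sample mean.
    g1 = statistics.variance(x) / n1
    g2 = statistics.variance(y) / n2
    t = (statistics.fmean(x) - statistics.fmean(y)) / math.sqrt(g1 + g2)
    nu_hat = (g1 + g2) ** 2 / (g1 ** 2 / (n1 - 1) + g2 ** 2 / (n2 - 1))
    return t, nu_hat

# Made-up example data.
x = [5.1, 4.9, 6.2, 5.8, 6.0, 5.5, 5.3, 6.1]
y = [4.2, 4.8, 3.9, 4.4]
t, nu_hat = welch_statistic(x, y)
print(f"T = {t:.3f}, estimated d.f. = {nu_hat:.2f}")
```

The estimated degrees of freedom always lie between min(''n''<sub>1</sub>, ''n''<sub>2</sub>) − 1 and ''n''<sub>1</sub> + ''n''<sub>2</sub> − 2. With equal sample sizes and equal sample variances they reach the upper bound, which is the value used by the pooled-variance Student's t-test.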
Exact Method: Te Test
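The Te test addresses the Behrens–Fisher problem, i.e., comparing the means of two normally distributed populations when the variances of the two populations are not assumed to be equal, based on two independent samples.
The Te test is developed as an exact test, which allows for unequal sample sizes and unequal variances of the two populations. The exact property still holds even with extremely small and unbalanced sample sizes.
The Te statistic to test whether the means are different can be calculated as follows:
Let <math>X = [X_1, X_2, \ldots, X_m]^T</math> and <math>Y = [Y_1, Y_2, \ldots, Y_n]^T</math> be the i.i.d. sample vectors (<math>m \ge n</math>) from <math>N(\mu_1, \sigma_1^2)</math> and <math>N(\mu_2, \sigma_2^2)</math> respectively.
Let <math>(P^T)_{n \times n}</math> be an <math>n \times n</math> orthogonal matrix whose elements of the first row are all <math>1/\sqrt{n}</math>; similarly, let <math>(Q^T)_{n \times m}</math> be the first ''n'' rows of an <math>m \times m</math> orthogonal matrix (whose elements of the first row are all <math>1/\sqrt{m}</math>).
Then
:<math>Z := \frac{(Q^T)_{n \times m} X}{\sqrt{m}} - \frac{(P^T)_{n \times n} Y}{\sqrt{n}} = [\bar X - \bar Y, Z_2, \ldots, Z_n]^T</math>
is an ''n''-dimensional normal random vector:
:<math>Z \sim N\left( (\mu_1 - \mu_2, 0, \ldots, 0)^T, \left( \frac{\sigma_1^2}{m} + \frac{\sigma_2^2}{n} \right) I_n \right).</math>
From the above distribution we see that
:<math>Z_1 = \bar X - \bar Y \sim N\left( \mu_1 - \mu_2, \frac{\sigma_1^2}{m} + \frac{\sigma_2^2}{n} \right),</math>
:<math>\frac{\sum_{i=2}^n Z_i^2}{\sigma_1^2/m + \sigma_2^2/n} \sim \chi^2_{n-1},</math>
and that <math>Z_1</math> is independent of <math>\sum_{i=2}^n Z_i^2</math>, so that
:<math>T_e := \frac{Z_1 - (\mu_1 - \mu_2)}{\sqrt{\sum_{i=2}^n Z_i^2 / (n-1)}} \sim t_{n-1}.</math>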
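The construction can be sketched in Python as follows. The section only requires orthogonal matrices whose first row is constant; this sketch uses Helmert matrices as one convenient choice, which is an assumption of the example rather than part of the method's definition. The function names and data are hypothetical. The statistic is formed from the first component of ''Z'' (the difference of sample means) divided by the root mean square of the remaining ''n'' − 1 components, and is referred to a Student ''t'' distribution with ''n'' − 1 degrees of freedom.

```python
import math

def helmert_rows(size, rows):
    """First `rows` rows of a size x size Helmert matrix: orthonormal rows,
    the first of which has every entry equal to 1/sqrt(size)."""
    out = [[1.0 / math.sqrt(size)] * size]
    for k in range(2, rows + 1):
        c = 1.0 / math.sqrt(k * (k - 1))
        out.append([c] * (k - 1) + [-(k - 1) * c] + [0.0] * (size - k))
    return out

def te_statistic(x, y, delta0=0.0):
    """Return (Te, df, z) for H0: mu_x - mu_y = delta0.
    x is the larger sample (m >= n >= 2). ASSUMPTION: Helmert matrices are
    used for P and Q; other valid choices give other values of Te."""
    m, n = len(x), len(y)
    if m < n or n < 2:
        raise ValueError("need len(x) >= len(y) >= 2")
    q = helmert_rows(m, n)  # first n rows of an m x m orthogonal matrix
    p = helmert_rows(n, n)  # full n x n orthogonal matrix
    z = [
        sum(a * v for a, v in zip(q[i], x)) / math.sqrt(m)
        - sum(a * v for a, v in zip(p[i], y)) / math.sqrt(n)
        for i in range(n)
    ]
    scale = math.sqrt(sum(v * v for v in z[1:]) / (n - 1))
    return (z[0] - delta0) / scale, n - 1, z

# Made-up example data.
x = [5.1, 4.9, 6.2, 5.8, 6.0, 5.5, 5.3, 6.1]
y = [4.2, 4.8, 3.9, 4.4]
te, df, z = te_statistic(x, y)
print(f"Te = {te:.3f} on {df} degrees of freedom")
```

Because the components ''Z''<sub>2</sub>, ..., ''Z''<sub>''n''</sub> depend on the chosen matrices and on the order of the observations, the numerical value of the statistic is not unique; only its null distribution is fixed.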
Other approaches
A number of different approaches to the general problem have been proposed, some of which claim to "solve" some version of the problem. Among these are:
:* that of Chapman in 1950,
:* that of Prokof’yev and Shishkin in 1974,
:* that of Dudewicz and Ahmed in 1998,
:* that of Chang Wang in 2022.
In a comparison of selected methods by Dudewicz and co-authors,[Dudewicz, Ma, Mai, and Su (2007)] the Dudewicz–Ahmed procedure was recommended for practical use.
Exact solutions to the common and generalized Behrens–Fisher problems
For several decades, it was commonly believed that no exact solution to the common Behrens–Fisher problem existed. However, it was proved in 1966 that an exact solution exists. In 2018, the probability density function of a generalized Behrens–Fisher distribution of ''m'' means and ''m'' distinct standard errors, from ''m'' samples of distinct sizes drawn from independent normal distributions with distinct means and variances, was derived; the same paper also examined its asymptotic approximations. A follow-up paper showed that the classic paired ''t''-test is a central Behrens–Fisher problem with a non-zero population correlation coefficient, and derived the corresponding probability density function by solving the associated non-central Behrens–Fisher problem with a non-zero population correlation coefficient. Its appendix also solved a more general non-central Behrens–Fisher problem with a non-zero population correlation coefficient.
Variants
A minor variant of the Behrens–Fisher problem has been studied. In this instance the problem is, assuming that the two population means are in fact the same, to make inferences about the common mean: for example, one could require a confidence interval for the common mean.
Generalisations
One generalisation of the problem involves multivariate normal distributions with unknown covariance matrices, and is known as the multivariate Behrens–Fisher problem.[Belloni & Didier (2008)]
The nonparametric Behrens–Fisher problem does not assume that the distributions are normal. Tests include the Cucconi test of 1968 and the Lepage test of 1971.
References
* Lehmann, E. L. (1975) ''Nonparametrics: Statistical Methods Based on Ranks'', Holden-Day, McGraw-Hill
* Ruben, H. (200
"A simple conservative and robust solution of the Behrens–Fisher problem"
''Sankhyā: The Indian Journal of Statistics'', Series A, 64 (1), 139–155.
External links
* Dong, B.L. (2004) ''The Behrens–Fisher Problem: An Empirical Likelihood Approach'', Econometrics Working Paper EWP0404, University of Victoria