Šidák Correction For T-test

One application of Student's t-test is to test the location of a sequence of independent and identically distributed random variables. To test the locations of several such sequences simultaneously, the Šidák correction should be applied to calibrate the level of the individual t-tests. If the number of sequences is very large, growing without bound as the sample size increases, the Šidák correction can still be used, but with caution: its validity depends on how fast the number of sequences goes to infinity.


Introduction

Suppose we are interested in m different hypotheses, H_1, \ldots, H_m, and would like to check whether all of them are true. The hypothesis test then becomes
: H_{\text{null}}: all of H_i are true;
: H_{\text{alternative}}: at least one of H_i is false.
Let \alpha be the level of this test, that is, the type-I error probability of falsely rejecting H_{\text{null}} when it is true. We aim to design a test with a given level \alpha. Suppose that the test statistic used for each hypothesis H_i is t_i. If the t_i are independent, then a test for H_{\text{null}} can be developed by the following procedure, known as the Šidák correction.
:Step 1, test each of the m null hypotheses at level 1-(1-\alpha)^{\frac{1}{m}}.
:Step 2, if any of these m null hypotheses is rejected, reject H_{\text{null}}.
If each individual test has exact level 1-(1-\alpha)^{\frac{1}{m}}, independence implies that the probability of no rejection under H_{\text{null}} is \left((1-\alpha)^{\frac{1}{m}}\right)^m = 1-\alpha, so the combined test has level \alpha.
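The two steps of the procedure can be sketched in Python. This is a minimal illustration rather than a reference implementation: the function names are hypothetical, and it assumes that each individual test is summarised by a p-value, so that rejecting at level \beta means p \le \beta (p-values are not part of the description above).

```python
def sidak_level(alpha: float, m: int) -> float:
    """Step 1: per-test level 1 - (1 - alpha)^(1/m)."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (0, 1)")
    if m < 1:
        raise ValueError("m must be a positive integer")
    return 1.0 - (1.0 - alpha) ** (1.0 / m)


def sidak_reject(p_values, alpha: float) -> bool:
    """Step 2: reject the global null if any single test rejects."""
    p_values = list(p_values)
    level = sidak_level(alpha, len(p_values))
    # Assumption: a test "rejects at level beta" when its p-value is <= beta.
    return any(p <= level for p in p_values)


def familywise_level(per_test_level: float, m: int) -> float:
    """P(at least one of m independent exact tests rejects) under the global null."""
    return 1.0 - (1.0 - per_test_level) ** m
```

With \alpha = 0.05 and m = 10, `sidak_level` gives a per-test level slightly above the Bonferroni value \alpha/m = 0.005, and `familywise_level` recovers 0.05 when the tests are independent and exact.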


Finite case

For finitely many t-tests, suppose Y_{ij}=\mu_i+\epsilon_{ij}, i=1,\ldots,N, j=1,\ldots,n, where for each i, \epsilon_{i1},\ldots,\epsilon_{in} are independently and identically distributed, for each j, \epsilon_{1j},\ldots,\epsilon_{Nj} are independent but not necessarily identically distributed, and \epsilon_{ij} has a finite fourth moment. Our goal is to design a test for H_{\text{null}}: \mu_i=0, \forall i=1,\ldots,N with level \alpha. This test can be based on the t-statistic of each sequence, that is,
: t_i=\frac{\bar{Y}_i}{S_i/\sqrt{n}},
where
: \bar{Y}_i=\frac{1}{n}\sum_{j=1}^{n}Y_{ij}, \qquad S_i^2=\frac{1}{n}\sum_{j=1}^{n}(Y_{ij}-\bar{Y}_i)^2.
Using the Šidák correction, we reject H_{\text{null}} if any of the t-tests based on the t-statistics above rejects at level 1-(1-\alpha)^{\frac{1}{N}}. More specifically, we reject H_{\text{null}} when
: \exists i \in \{1,\ldots,N\} : |t_i| > \zeta_{\alpha,N},
where
: P(|Z|>\zeta_{\alpha,N})=1-(1-\alpha)^{\frac{1}{N}}, \qquad Z\sim N(0,1).
The test defined above has asymptotic level \alpha as n \to \infty with N fixed, because
: \begin{align}
\text{level} &= P_{\text{null}} \left( \text{reject } H_{\text{null}} \right) \\
&= P_{\text{null}} \left( \exists i \in \{1,\ldots,N\} : |t_i| > \zeta_{\alpha,N} \right) \\
&= 1 - P_{\text{null}} \left( \forall i \in \{1,\ldots,N\} : |t_i| \leq \zeta_{\alpha,N} \right) \\
&= 1 - \prod_{i=1}^{N} P_{\text{null}} \left( |t_i| \leq \zeta_{\alpha,N} \right) \\
&\to 1 - \prod_{i=1}^{N} P \left( |Z_i| \leq \zeta_{\alpha,N} \right) && Z_i \sim N(0,1) \\
&= \alpha
\end{align}
The product form uses the independence of the t_i across sequences, and the last equality holds because each factor equals (1-\alpha)^{\frac{1}{N}}.
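The finite-case test can be sketched with the Python standard library alone. The function names below are hypothetical, and the sketch makes two assumptions that should be checked against the intended definitions: the variance S_i^2 is normalised by 1/n, and the critical value comes from the standard normal distribution, as in the rejection rule above.

```python
import math
from statistics import NormalDist


def t_statistic(y):
    """t_i = mean / (S / sqrt(n)), with S^2 normalised by 1/n (an assumption)."""
    n = len(y)
    mean = sum(y) / n
    s2 = sum((v - mean) ** 2 for v in y) / n
    if s2 == 0.0:
        raise ValueError("sequence has zero variance; t-statistic undefined")
    return mean / math.sqrt(s2 / n)


def sidak_critical_value(alpha, N):
    """zeta with P(|Z| > zeta) = 1 - (1 - alpha)^(1/N) for Z ~ N(0, 1)."""
    per_test_level = 1.0 - (1.0 - alpha) ** (1.0 / N)
    # Two-sided test: put half of the per-test level in each tail.
    return NormalDist().inv_cdf(1.0 - per_test_level / 2.0)


def sidak_t_test(sequences, alpha=0.05):
    """Return (reject, t_values, zeta) for the global null mu_i = 0 for all i."""
    zeta = sidak_critical_value(alpha, len(sequences))
    t_values = [t_statistic(y) for y in sequences]
    reject = any(abs(t) > zeta for t in t_values)
    return reject, t_values, zeta
```

Calling `sidak_t_test` on a list of N sequences of equal length n returns the global decision together with the individual t-statistics, so it is easy to see which sequence, if any, triggered the rejection.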


Infinite case

In some cases, the number of sequences, N, increases as the data size of each sequence, n, increases. In particular, suppose N(n)\rightarrow \infty \text{ as } n \rightarrow \infty. We then need to test a null hypothesis that includes infinitely many hypotheses, that is,
: H_{\text{null}}: \text{all of } H_i \text{ are true}, \quad i=1,2,\ldots
To design a test, the Šidák correction may be applied, as in the case of finitely many t-tests. However, when N(n)\rightarrow \infty \text{ as } n\rightarrow \infty, the Šidák correction for t-tests may not achieve the desired level; that is, the true level of the test may not converge to the nominal level \alpha as n goes to infinity. This result is related to high-dimensional statistics
and has been proven rigorously. Specifically, for the true level of the test to converge to the nominal level \alpha, we need a restriction on how fast N(n)\rightarrow \infty:
* When all of the \epsilon_{ij} have distributions symmetric about zero, it is sufficient to require \log N = o(n^{1/2}) to guarantee that the true level converges to \alpha.
* When the distributions of the \epsilon_{ij} are asymmetric, it is necessary to impose \log N = o(n^{1/3}) to ensure that the true level converges to \alpha.
* If the bootstrap is used to calibrate the level, then only \log N = o(n^{1/2}) is needed, even if the \epsilon_{ij} have asymmetric distributions.

The results above are based on
the central limit theorem. According to the central limit theorem, each t-statistic t_i is asymptotically standard normal, so the difference between the distribution of each t_i and the standard normal distribution is asymptotically negligible. The question is whether the aggregate of all these differences is still asymptotically negligible. When there are finitely many t_i, the answer is yes. But when the number of t_i grows without bound, the answer is sometimes no, because in that case we are summing infinitely many infinitesimal terms. If the number of terms goes to infinity too fast, that is, if N(n) \rightarrow \infty too fast, then the sum may not vanish: the joint behaviour of the t-statistics can no longer be approximated using the standard normal distribution, the true level does not converge to the nominal level \alpha, and the Šidák correction fails.
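The aggregation argument can be made concrete with a toy calculation. The sketch below is an illustration under a strong simplifying assumption, not a result from the text: every t-test is taken to reject with probability (1 + \delta) times its nominal per-test level, where the relative error \delta (the hypothetical parameter `rel_err`) stands in for the normal-approximation error at the Šidák threshold.

```python
import math


def actual_level(alpha: float, N: int, rel_err: float) -> float:
    """Familywise level when each per-test tail probability is off by (1 + rel_err).

    rel_err is a hypothetical relative error of the normal approximation;
    it is an illustrative input and is not estimated from data.
    """
    if rel_err < -1.0:
        raise ValueError("rel_err must be at least -1")
    # Nominal per-test level 1 - (1 - alpha)^(1/N), computed stably for large N.
    beta = -math.expm1(math.log1p(-alpha) / N)
    # Hypothetical true per-test rejection probability, capped at 1.
    true_beta = min(1.0, (1.0 + rel_err) * beta)
    if true_beta >= 1.0:
        return 1.0
    # Independent tests: P(at least one rejection) = 1 - (1 - true_beta)^N.
    return -math.expm1(N * math.log1p(-true_beta))


# Illustrative use: the same relative error, more and more tests.
for N in (10, 1_000, 100_000):
    print(N, actual_level(0.05, N, rel_err=0.5))
```

In this toy model, a relative error that stays fixed while N grows drives the level towards 1-(1-\alpha)^{1+\delta} rather than \alpha, so the true level converges to the nominal one only if the relative error at the threshold vanishes. Restrictions on the growth of N(n) are what keep that error under control.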


See also

* Šidák correction
* Multiple comparisons
* Bonferroni correction
* Family-wise error rate
* Closed testing procedure

