In statistics, Hoeffding's test of independence, named after Wassily Hoeffding, is a test based on the population measure of deviation from independence
:H = \int (F_{12}-F_1F_2)^2 \, dF_{12}
where F_{12} is the joint distribution function of two random variables, and F_1 and F_2 are their marginal distribution functions. Hoeffding derived an unbiased estimator of H that can be used to test for independence; the resulting test is consistent for any continuous alternative. The test should only be applied to data drawn from a continuous distribution, since H has a defect for discontinuous F_{12}, namely that it is not necessarily zero when F_{12}=F_1F_2. This drawback can be overcome by integrating with respect to dF_1F_2 instead of dF_{12}. The modified measure is known as the Blum–Kiefer–Rosenblatt coefficient. A paper published in 2008 (Wilding, G.E., Mudholkar, G.S. (2008) "Empirical approximations for Hoeffding's test of bivariate independence using two Weibull extensions", ''Statistical Methodology'', 5 (2), 160–170) describes both the calculation of a sample-based version of this measure for use as a test statistic, and the calculation of the null distribution of this test statistic.
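As an illustration of how a sample estimate of H can be computed, the following is a minimal Python sketch of the rank-based form in which Hoeffding's statistic is usually stated. It is written from the standard textbook presentation rather than from anything given in this article. It assumes continuous data with no ties and at least five observations, and the function name `hoeffding_d` is hypothetical. It returns only the statistic; the null distribution needed for a p-value is not covered.

```python
import numpy as np


def hoeffding_d(x, y):
    """Rank-based sample estimate of H (sketch; assumes no ties, n >= 5)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    if y.size != n:
        raise ValueError("x and y must have the same length")
    if n < 5:
        raise ValueError("at least five observations are required")
    if np.unique(x).size != n or np.unique(y).size != n:
        raise ValueError("this sketch assumes continuous data without ties")

    # Marginal ranks 1..n of each observation.
    r = (np.argsort(np.argsort(x)) + 1).astype(float)
    s = (np.argsort(np.argsort(y)) + 1).astype(float)

    # c[i] = number of points j with x_j < x_i and y_j < y_i.
    below = (x[None, :] < x[:, None]) & (y[None, :] < y[:, None])
    c = below.sum(axis=1).astype(float)

    a = np.sum((r - 1) * (r - 2) * (s - 1) * (s - 2))
    b = np.sum((r - 2) * (s - 2) * c)
    q = np.sum(c * (c - 1))

    numerator = a - 2 * (n - 2) * b + (n - 2) * (n - 3) * q
    denominator = n * (n - 1) * (n - 2) * (n - 3) * (n - 4)
    return numerator / denominator


if __name__ == "__main__":
    rng = np.random.default_rng(0)  # arbitrary seed, for illustration only
    u = rng.normal(size=200)
    v = rng.normal(size=200)
    print("independent sample:", hoeffding_d(u, v))
    print("dependent sample:  ", hoeffding_d(u, u ** 2 + 0.1 * v))
```

For perfectly monotone data (increasing or decreasing) this form evaluates to 1/30, which is why some software reports 30 times the statistic. The pairwise comparison uses O(n^2) memory, so the sketch is suited to small and moderate samples only.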


See also

* Correlation
* Kendall's tau
* Spearman's rank correlation coefficient
* Distance correlation


References


Primary sources

* Wassily Hoeffding, "A non-parametric test of independence", ''Annals of Mathematical Statistics'' 19: 293–325, 1948.
* Hollander and Wolfe, ''Non-parametric statistical methods'' (Section 8.7), Wiley, 1999.