statistical hypothesis testing A statistical hypothesis test is a method of statistical inference used to decide whether the data provide sufficient evidence to reject a particular hypothesis. A statistical hypothesis test typically involves a calculation of a test statistic. T ...

, a two-sample test is a test performed on the data of two

random sample In this statistics, quality assurance, and survey methodology, sampling is the selection of a subset or a statistical sample (termed sample for short) of individuals from within a statistical population to estimate characteristics of the whole ...

s, each

independent Independent or Independents may refer to: Arts, entertainment, and media Artist groups * Independents (artist group), a group of modernist painters based in Pennsylvania, United States * Independentes (English: Independents), a Portuguese artist ...

ly obtained from a different given

population Population is a set of humans or other organisms in a given region or area. Governments conduct a census to quantify the resident population size within a given jurisdiction. The term is also applied to non-human animals, microorganisms, and pl ...

. The purpose of the test is to determine whether the difference between these two populations is

statistically significant In statistical hypothesis testing, a result has statistical significance when a result at least as "extreme" would be very infrequent if the null hypothesis were true. More precisely, a study's defined significance level, denoted by \alpha, is the ...

. There are a large number of statistical tests that can be used in a two-sample test. Which one(s) are appropriate depend on a variety of factors, such as: * Which assumptions (if any) may be made ''a priori'' about the

distribution Distribution may refer to: Mathematics *Distribution (mathematics), generalized functions used to formulate solutions of partial differential equations *Probability distribution, the probability of a particular value or value range of a varia ...

s from which the data have been sampled? For example, in many situations it may be assumed that the underlying distributions are

normal distribution In probability theory and statistics, a normal distribution or Gaussian distribution is a type of continuous probability distribution for a real-valued random variable. The general form of its probability density function is f(x) = \frac ...

s. In other cases the data are categorical, coming from a

discrete distribution In probability theory and statistics, a probability distribution is a function that gives the probabilities of occurrence of possible events for an experiment. It is a mathematical description of a random phenomenon in terms of its sample spac ...

over a

nominal scale Level of measurement or scale of measure is a classification that describes the nature of information within the values assigned to variables. Psychologist Stanley Smith Stevens developed the best-known classification with four levels, or scale ...

, such as which entry was selected from a menu. * Does the hypothesis being tested apply to the distributions as a whole, or just some population parameter, for example the

mean A mean is a quantity representing the "center" of a collection of numbers and is intermediate to the extreme values of the set of numbers. There are several kinds of means (or "measures of central tendency") in mathematics, especially in statist ...

or the

variance In probability theory and statistics, variance is the expected value of the squared deviation from the mean of a random variable. The standard deviation (SD) is obtained as the square root of the variance. Variance is a measure of dispersion ...

? * Is the

hypothesis A hypothesis (: hypotheses) is a proposed explanation for a phenomenon. A scientific hypothesis must be based on observations and make a testable and reproducible prediction about reality, in a process beginning with an educated guess o ...

being tested merely that there is a difference in the relevant population characteristics (in which case a two-sided test may be indicated), or does it involve a specific bias ("A is better than B"), so that a one-sided test can be used?

Relevant tests

Statistical tests that may apply for two-sample testing include: * Hotelling's T-squared distribution#Two-sample statistic * Kernel embedding of distributions#Kernel two-sample test * Kolmogorov–Smirnov_test#Two-sample_Kolmogorov–Smirnov_test *

Kuiper's test Kuiper's test is used in statistics to test whether a data sample comes from a given distribution (one-sample Kuiper test), or whether two data samples came from the same unknown distribution (two-sample Kuiper test). It is named after Dutch math ...

Median test The median test (also Mood’s median-test, Westenberg-Mood median test or Brown-Mood median test) is a special case of Pearson's chi-squared test. It is a nonparametric test that tests the null hypothesis that the medians of the populations from ...

Pearson's chi-squared test Pearson's chi-squared test or Pearson's \chi^2 test is a statistical test applied to sets of categorical data to evaluate how likely it is that any observed difference between the sets arose by chance. It is the most widely used of many chi-squa ...

* Student's t-test#Two-sample_t-tests * Welch's t-test * Tukey–Duckworth test *

Mann–Whitney U test The Mann–Whitney U test (also called the Mann–Whitney–Wilcoxon (MWW/MWU), Wilcoxon rank-sum test, or Wilcoxon–Mann–Whitney test) is a nonparametric statistical test of the null hypothesis that randomly selected values ''X'' and ''Y'' f ...

* Two-proportion Z-test

Relevant tests

See also