In statistics, sequential analysis or sequential hypothesis testing is statistical analysis where the sample size is not fixed in advance. Instead data are evaluated as they are collected, and further sampling is stopped in accordance with a pre-defined stopping rule as soon as significant results are observed. Thus a conclusion may sometimes be reached at a much earlier stage than would be possible with more classical hypothesis testing or estimation, at consequently lower financial and/or human cost.
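The core idea can be illustrated with Wald's sequential probability ratio test (SPRT), which accumulates evidence one observation at a time and stops as soon as a decision boundary is crossed. The following is a minimal sketch rather than a reference implementation: the hypotheses (a coin with heads probability 0.5 versus 0.7), the error targets, and the simulated true probability are all illustrative assumptions.

```python
import math
import random

# Minimal SPRT sketch: H0: p = 0.5 vs H1: p = 0.7 for a coin's heads
# probability, with target error rates alpha = beta = 0.05 (illustrative).
alpha, beta = 0.05, 0.05
p0, p1 = 0.5, 0.7
lower = math.log(beta / (1 - alpha))   # boundary for accepting H0
upper = math.log((1 - beta) / alpha)   # boundary for accepting H1

random.seed(0)
llr, n = 0.0, 0
while lower < llr < upper:
    x = 1 if random.random() < 0.7 else 0  # observe one flip; true p = 0.7
    # add this observation's log likelihood ratio
    llr += math.log((p1 if x else 1 - p1) / (p0 if x else 1 - p0))
    n += 1

verdict = "H1 (p = 0.7)" if llr >= upper else "H0 (p = 0.5)"
print(f"stopped after {n} flips, decided {verdict}")
```

Because sampling stops as soon as either boundary is crossed, the number of observations needed is itself random, and on average far smaller than a comparable fixed-sample test.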
History
The method of sequential analysis is first attributed to Abraham Wald with Jacob Wolfowitz, W. Allen Wallis, and Milton Friedman while at Columbia University's Statistical Research Group as a tool for more efficient industrial quality control during World War II. Its value to the war effort was immediately recognised, and led to its receiving a "restricted" classification.
At the same time, George Barnard led a group working on optimal stopping in Great Britain. Another early contribution to the method was made by K.J. Arrow with D. Blackwell and M.A. Girshick.
A similar approach was independently developed from first principles at about the same time by Alan Turing, as part of the Banburismus technique used at Bletchley Park, to test hypotheses about whether different messages coded by German Enigma machines should be connected and analysed together. This work remained secret until the early 1980s.
Peter Armitage introduced the use of sequential analysis in medical research, especially in the area of clinical trials. Sequential methods became increasingly popular in medicine following Stuart Pocock's work that provided clear recommendations on how to control Type I error rates in sequential designs.
Alpha spending functions
When researchers repeatedly analyze data as more observations are added, the probability of a Type I error increases. Therefore, it is important to adjust the alpha level at each interim analysis, such that the overall Type I error rate remains at the desired level. This is conceptually similar to using the Bonferroni correction, but because the repeated looks at the data are dependent, more efficient corrections for the alpha level can be used. Among the earliest proposals is the Pocock boundary. Alternative ways to control the Type I error rate exist, such as the Haybittle–Peto bounds, and additional work on determining the boundaries for interim analyses has been done by O'Brien & Fleming and Wang & Tsiatis.
A limitation of corrections such as the Pocock boundary is that the number of looks at the data must be determined before the data are collected, and that the looks at the data should be equally spaced (e.g., after 50, 100, 150, and 200 patients). The alpha spending function approach developed by DeMets & Lan does not have these restrictions, and depending on the parameters chosen for the spending function, can be very similar to Pocock boundaries or the corrections proposed by O'Brien and Fleming.
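The inflation of the Type I error rate under repeated looks, and its repair by a constant boundary, can be checked with a small simulation. This is a minimal sketch under stated assumptions: five equally spaced looks, a one-sample z-test with known variance, and 2.413 as Pocock's tabulated two-sided 5% constant for five looks; the group sizes and simulation count are arbitrary.

```python
import numpy as np

# Monte Carlo under the null hypothesis (mean 0, sd 1): test at each of
# five equally spaced looks, once with the unadjusted 1.96 boundary and
# once with a Pocock-style constant boundary of 2.413.
rng = np.random.default_rng(0)
n_sims, n_looks, n_per_look = 20_000, 5, 20

naive_rej = pocock_rej = 0
for _ in range(n_sims):
    data = rng.standard_normal(n_looks * n_per_look)
    z_at_looks = [
        data[: (k + 1) * n_per_look].mean()
        * np.sqrt((k + 1) * n_per_look)      # one-sample z statistic
        for k in range(n_looks)
    ]
    zmax = max(abs(z) for z in z_at_looks)
    naive_rej += zmax > 1.96     # unadjusted two-sided 5% at every look
    pocock_rej += zmax > 2.413   # Pocock constant boundary for 5 looks

print(f"naive repeated testing : {naive_rej / n_sims:.3f}")   # ~0.14
print(f"Pocock boundary        : {pocock_rej / n_sims:.3f}")  # ~0.05
```

With five unadjusted looks the overall Type I error rate is roughly 14% rather than the nominal 5%, which the constant boundary restores.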
Applications of sequential analysis
Clinical trials
In a randomized trial with two treatment groups, group sequential testing may, for example, be conducted in the following manner: After n subjects in each group are available an interim analysis is conducted. A statistical test is performed to compare the two groups and if the null hypothesis is rejected the trial is terminated; otherwise, the trial continues, another n subjects per group are recruited, and the statistical test is performed again, including all subjects. If the null is rejected, the trial is terminated, and otherwise it continues with periodic evaluations until a maximum number of interim analyses have been performed, at which point the last statistical test is conducted and the trial is discontinued.
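A hypothetical sketch of such a group sequential loop is shown below. The group size, the number of looks, the simulated treatment effect, and the reuse of the Pocock-style critical value 2.413 are all assumptions for illustration; a real trial would use boundaries derived for its own design.

```python
import numpy as np
from scipy import stats

# Group sequential loop: recruit n per group per stage, test at each look
# against a fixed per-look boundary, and stop early on rejection.
rng = np.random.default_rng(1)
n_per_stage, max_looks, z_crit = 20, 5, 2.413
treat, ctrl = [], []

for look in range(1, max_looks + 1):
    treat.extend(rng.normal(0.5, 1.0, n_per_stage))  # true effect: 0.5 sd
    ctrl.extend(rng.normal(0.0, 1.0, n_per_stage))
    z, _ = stats.ttest_ind(treat, ctrl)  # t statistic, approx. z here
    if abs(z) > z_crit:
        print(f"stop at look {look}: z = {z:.2f}, reject null")
        break
else:
    print(f"all {max_looks} looks done: z = {z:.2f}, fail to reject")
```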
Other applications
Sequential analysis also has a connection to the problem of gambler's ruin that has been studied by, among others, Huygens in 1657.
Step detection is the process of finding abrupt changes in the mean level of a time series or signal. It is usually considered as a special kind of statistical method known as change point detection. Often, the step is small and the time series is corrupted by some kind of noise, and this makes the problem challenging because the step may be hidden by the noise. Therefore, statistical and/or signal processing algorithms are often required. When the algorithms are run online as the data is coming in, especially with the aim of producing an alert, this is an application of sequential analysis.
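One concrete sequential approach is a one-sided CUSUM statistic run online, raising an alert when it crosses a threshold. The sketch below assumes a known in-control mean; the allowance k and threshold h are illustrative tuning choices.

```python
import numpy as np

# Online CUSUM sketch for detecting an upward shift in the mean of a
# stream: a step of +1 sd is injected at sample 200 (illustrative data).
rng = np.random.default_rng(2)
stream = np.concatenate([rng.normal(0.0, 1.0, 200),   # in-control segment
                         rng.normal(1.0, 1.0, 200)])  # step of +1 at t=200

mu0, k, h = 0.0, 0.5, 5.0  # reference mean, allowance, decision threshold
s = 0.0
for t, x in enumerate(stream):
    s = max(0.0, s + (x - mu0 - k))  # one-sided CUSUM recursion
    if s > h:
        print(f"alert: upward mean shift detected at sample {t}")
        break
```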
Bias
Trials that are terminated early because they reject the null hypothesis typically overestimate the true effect size.
This is because in small samples, only large effect size estimates will lead to a significant effect, and the subsequent termination of a trial. Methods to correct effect size estimates in single trials have been proposed. Note that this bias is mainly problematic when interpreting single studies. In meta-analyses, overestimated effect sizes due to early stopping are balanced by underestimation in trials that stop late, leading Schou & Marschner to conclude that "early stopping of clinical trials is not a substantive source of bias in meta-analyses".
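The overestimation can be demonstrated with a short simulation: among trials that stop at the first look whose naive test is significant, the average estimated effect exceeds the true effect. The true effect of 0.3 sd, the look schedule, and the unadjusted 1.96 boundary below are illustrative assumptions.

```python
import numpy as np

# Simulate trials with four looks and a true effect of 0.3 sd; record the
# effect estimate whenever a trial stops at its first significant look.
rng = np.random.default_rng(3)
true_d, n_per_look, n_looks = 0.3, 25, 4
early_estimates = []

for _ in range(20_000):
    x = rng.normal(true_d, 1.0, n_looks * n_per_look)
    for k in range(1, n_looks + 1):
        n = k * n_per_look
        z = x[:n].mean() * np.sqrt(n)   # one-sample z, known sd
        if abs(z) > 1.96:               # naive unadjusted look
            early_estimates.append(x[:n].mean())
            break

print(f"true effect: {true_d}")
print(f"mean estimate in stopped trials: {np.mean(early_estimates):.2f}")
# the conditional mean is noticeably above 0.3
```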
The meaning of p-values in sequential analyses also changes, because when using sequential analyses, more than one analysis is performed, and the typical definition of a p-value as the data "at least as extreme" as is observed needs to be redefined. One solution is to order the p-values of a series of sequential tests based on the time of stopping and how high the test statistic was at a given look, which is known as stagewise ordering, first proposed by Armitage.
See also
* Optimal stopping
* Sequential estimation
* Sequential probability ratio test
* CUSUM
References
* Bartroff, J., Lai, T.L., and Shih, M.-C. (2013). Sequential Experimentation in Clinical Trials: Design and Analysis. Springer.
* Bakeman, R., and Gottman, J.M. (1997). Observing Interaction: An Introduction to Sequential Analysis. Cambridge: Cambridge University Press.
* Jennison, C., and Turnbull, B.W. (2000). Group Sequential Methods with Applications to Clinical Trials. Chapman & Hall/CRC.
* Whitehead, J. (1997). The Design and Analysis of Sequential Clinical Trials, 2nd Edition. John Wiley & Sons.
External links
* R package: Wald's Sequential Probability Ratio Test
* OnlineMarketr.com: software for conducting sequential analysis
* Software for conducting sequential analysis in the study of group interaction in computer-mediated communication by Dr. Allan Jeong at Florida State University
Commercial
* PASS Sample Size Software includes features for the setup of group sequential designs.