In statistics, sequential analysis or sequential hypothesis testing is statistical analysis where the sample size is not fixed in advance. Instead, data are evaluated as they are collected, and further sampling is stopped in accordance with a pre-defined stopping rule as soon as significant results are observed. Thus a conclusion may sometimes be reached at a much earlier stage than would be possible with more classical hypothesis testing or estimation, at consequently lower financial and/or human cost.
History
The method of sequential analysis is first attributed to Abraham Wald with Jacob Wolfowitz, W. Allen Wallis, and Milton Friedman while at Columbia University's Statistical Research Group as a tool for more efficient industrial quality control during World War II. Its value to the war effort was immediately recognised, and led to its receiving a "restricted" classification. At the same time, George Barnard led a group working on optimal stopping in Great Britain. Another early contribution to the method was made by K.J. Arrow with D. Blackwell and M.A. Girshick.
A similar approach was independently developed from first principles at about the same time by Alan Turing, as part of the Banburismus technique used at Bletchley Park, to test hypotheses about whether different messages coded by German Enigma machines should be connected and analysed together. This work remained secret until the early 1980s.
Peter Armitage introduced the use of sequential analysis in medical research, especially in the area of clinical trials. Sequential methods became increasingly popular in medicine following Stuart Pocock's work that provided clear recommendations on how to control Type 1 error rates in sequential designs.
Alpha spending functions
When researchers repeatedly analyze data as more observations are added, the probability of a Type 1 error increases. Therefore, it is important to adjust the alpha level at each interim analysis, such that the overall Type 1 error rate remains at the desired level. This is conceptually similar to using the Bonferroni correction, but because the repeated looks at the data are dependent, more efficient corrections for the alpha level can be used. Among the earliest proposals is the Pocock boundary. Alternative ways to control the Type 1 error rate exist, such as the Haybittle-Peto bounds, and additional work on determining the boundaries for interim analyses has been done by O’Brien & Fleming and Wang & Tsiatis.
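The inflation described above can be illustrated with a small Monte Carlo sketch. It assumes a simplified setting that is not taken from the article: one-sample z-tests on standard normal data with known variance, five equally spaced looks, and hypothetical names such as `false_positive_rate`. It compares a single test, five unadjusted looks, and five looks with a Bonferroni-adjusted threshold.

```python
import random
from statistics import NormalDist


def false_positive_rate(n_looks, n_per_look, critical_z, n_sims=4000, seed=1):
    """Estimate the chance of at least one 'significant' look when H0 is true.

    Assumption for this sketch: observations are N(0, 1) with known variance,
    and the cumulative z statistic is tested after every batch.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        total, n = 0.0, 0
        for _ in range(n_looks):
            # The sum of n_per_look N(0, 1) draws is N(0, n_per_look),
            # so the whole batch is drawn in one step.
            total += rng.gauss(0.0, n_per_look ** 0.5)
            n += n_per_look
            z = total / n ** 0.5  # z statistic for H0: mean = 0
            if abs(z) > critical_z:
                rejections += 1
                break
    return rejections / n_sims


z_unadjusted = NormalDist().inv_cdf(1 - 0.05 / 2)        # two-sided 5% threshold
z_bonferroni = NormalDist().inv_cdf(1 - (0.05 / 5) / 2)  # 5% split over 5 looks

single = false_positive_rate(1, 100, z_unadjusted)
naive = false_positive_rate(5, 20, z_unadjusted)
bonferroni = false_positive_rate(5, 20, z_bonferroni)
print(f"one look: {single:.3f}  five unadjusted looks: {naive:.3f}  "
      f"five Bonferroni looks: {bonferroni:.3f}")
```

In this setup the unadjusted five-look rate should come out well above the nominal 5%, while the Bonferroni rate falls below it, which is the conservatism that boundaries designed for dependent looks aim to reduce.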
A limitation of corrections such as the Pocock boundary is that the number of looks at the data must be determined before the data are collected, and that the looks at the data should be equally spaced (e.g., after 50, 100, 150, and 200 patients). The alpha spending function approach developed by DeMets & Lan does not have these restrictions and, depending on the parameters chosen for the spending function, can be very similar to Pocock boundaries or to the corrections proposed by O'Brien and Fleming.
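As a rough sketch of the spending-function idea, the two functions below use the commonly cited Pocock-type and O'Brien-Fleming-type forms of the cumulative alpha spent at information fraction t. The function names, the two-sided alpha of 0.05, and the unequally spaced look schedule are assumptions chosen for illustration; converting the spent alpha into actual test boundaries needs further numerical work that is not shown here.

```python
from math import e, log, sqrt
from statistics import NormalDist

_norm = NormalDist()


def pocock_type_spending(t, alpha=0.05):
    """Cumulative alpha spent at information fraction t (0 < t <= 1)."""
    return alpha * log(1 + (e - 1) * t)


def obrien_fleming_type_spending(t, alpha=0.05):
    """Cumulative alpha spent at information fraction t (0 < t <= 1)."""
    z = _norm.inv_cdf(1 - alpha / 2)
    return 2 * (1 - _norm.cdf(z / sqrt(t)))


# Hypothetical, unequally spaced looks expressed as fractions of the
# maximum information (for example, of the planned sample size).
looks = [0.2, 0.5, 0.65, 1.0]

previous_p = previous_o = 0.0
for t in looks:
    p = pocock_type_spending(t)
    o = obrien_fleming_type_spending(t)
    # The alpha available at each look is the increment since the last look.
    print(f"t={t:.2f}  Pocock-type: +{p - previous_p:.5f} (total {p:.5f})  "
          f"OBF-type: +{o - previous_o:.5f} (total {o:.5f})")
    previous_p, previous_o = p, o
```

Both functions spend exactly the full alpha at t = 1. The Pocock-type form spends it fairly evenly, whereas the O'Brien-Fleming-type form spends very little at early looks and saves most of it for the final analysis.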
Applications of sequential analysis
Clinical trials
In a randomized trial with two treatment groups, group sequential testing may, for example, be conducted in the following manner: after n subjects in each group are available, an interim analysis is conducted. A statistical test is performed to compare the two groups, and if the null hypothesis is rejected, the trial is terminated; otherwise, the trial continues, another n subjects per group are recruited, and the statistical test is performed again, including all subjects. If the null is rejected, the trial is terminated; otherwise it continues with periodic evaluations until a maximum number of interim analyses have been performed, at which point the last statistical test is conducted and the trial is discontinued.
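The procedure above can be sketched as a short loop. This is a minimal illustration under stated assumptions rather than a trial-ready implementation: outcomes are assumed normal with a known common standard deviation `sigma`, the comparison is a two-sample z-test, and the function name and arguments are hypothetical. The critical value 2.413 used in the demo is the tabulated Pocock constant for five looks at a two-sided 5% level.

```python
from math import sqrt


def group_sequential_test(group_a, group_b, n_per_look, max_looks,
                          critical_z, sigma=1.0):
    """Run the interim analyses in order; return (look, z, rejected).

    Assumes each group holds at least n_per_look * max_looks outcomes,
    listed in order of recruitment.
    """
    z = 0.0
    for look in range(1, max_looks + 1):
        n = look * n_per_look  # subjects per group, including earlier batches
        mean_a = sum(group_a[:n]) / n
        mean_b = sum(group_b[:n]) / n
        # Two-sample z statistic with known, equal standard deviation.
        z = (mean_a - mean_b) / (sigma * sqrt(2.0 / n))
        if abs(z) > critical_z:
            return look, z, True  # null rejected: stop the trial
    return max_looks, z, False  # maximum number of looks reached


# Artificial data: a constant difference of 1.0 between the groups.
treated = [1.0] * 50
control = [0.0] * 50
look, z, rejected = group_sequential_test(treated, control, n_per_look=10,
                                          max_looks=5, critical_z=2.413)
print(look, round(z, 3), rejected)
```

With these artificial values the first look (10 subjects per group) gives z of about 2.24, which does not cross the boundary, and the second look (20 per group) gives z of about 3.16, so the trial stops at look 2.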
Other applications
Sequential analysis also has a connection to the problem of gambler's ruin that has been studied by, among others, Huygens in 1657.
Step detection is the process of finding abrupt changes in the mean level of a time series or signal. It is usually considered as a special kind of statistical method known as change point detection. Often, the step is small and the time series is corrupted by some kind of noise, and this makes the problem challenging because the step may be hidden by the noise. Therefore, statistical and/or signal processing algorithms are often required. When the algorithms are run online as the data is coming in, especially with the aim of producing an alert, this is an application of sequential analysis.
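One common way to run such a detector online is a one-sided CUSUM, sketched below. The article does not specify an algorithm, so the recursion, the function name, and the `slack` and `threshold` values are illustrative assumptions; in practice these parameters are tuned to the noise level and the smallest step worth detecting.

```python
def cusum_alarm(samples, target_mean, slack, threshold):
    """Return the index of the first alarm for an upward shift, or None.

    One-sided CUSUM: accumulate deviations above target_mean + slack,
    reset the sum to zero whenever it would go negative, and raise an
    alarm once it exceeds threshold.
    """
    s = 0.0
    for i, x in enumerate(samples):
        s = max(0.0, s + (x - target_mean - slack))
        if s > threshold:
            return i
    return None


# Artificial signal: alternating +/-0.3 "noise" around 0, with a step of
# size 1.0 starting at index 20.
signal = [(0.3 if i % 2 == 0 else -0.3) + (1.0 if i >= 20 else 0.0)
          for i in range(40)]
flat = [0.3 if i % 2 == 0 else -0.3 for i in range(40)]

print(cusum_alarm(signal, target_mean=0.0, slack=0.5, threshold=2.5))  # 24
print(cusum_alarm(flat, target_mean=0.0, slack=0.5, threshold=2.5))    # None
```

Because each sample is processed once and only the running sum is kept, the detector can be applied as data arrive. A larger threshold lowers the false alarm rate at the cost of a longer detection delay.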
Bias
Trials that are terminated early because they reject the null hypothesis typically overestimate the true effect size. This is because in small samples, only large effect size estimates will lead to a significant effect and the subsequent termination of a trial. Methods to correct effect size estimates in single trials have been proposed. Note that this bias is mainly problematic when interpreting single studies. In meta-analyses, overestimated effect sizes due to early stopping are balanced by underestimation in trials that stop late, leading Schou & Marschner to conclude that "early stopping of clinical trials is not a substantive source of bias in meta-analyses".
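The direction of this bias can be checked with a small simulation. The sketch below makes several simplifying assumptions that are not from the article: a one-sample design with unit-variance normal outcomes, a true effect of 0.2, five equally spaced looks of 20 observations, and the tabulated Pocock constant 2.413 as the stopping threshold. All names are hypothetical.

```python
import random
from math import sqrt
from statistics import mean


def simulate_estimates(true_effect, n_per_look, max_looks, critical_z,
                       n_sims=5000, seed=7):
    """Return effect estimates split by whether the trial stopped early."""
    rng = random.Random(seed)
    stopped_early, ran_to_end = [], []
    for _ in range(n_sims):
        total, n = 0.0, 0
        for look in range(1, max_looks + 1):
            # Sum of n_per_look unit-variance outcomes, drawn in one step.
            total += rng.gauss(true_effect * n_per_look, sqrt(n_per_look))
            n += n_per_look
            estimate = total / n
            z = estimate * sqrt(n)
            if look < max_looks and abs(z) > critical_z:
                stopped_early.append(estimate)
                break
        else:
            # No interim boundary was crossed: keep the final estimate.
            ran_to_end.append(estimate)
    return stopped_early, ran_to_end


early, late = simulate_estimates(true_effect=0.2, n_per_look=20,
                                 max_looks=5, critical_z=2.413)
print(f"stopped early: {len(early)} trials, mean estimate {mean(early):.3f}")
print(f"ran to the end: {len(late)} trials, mean estimate {mean(late):.3f}")
```

In this setup the early-stopped trials should show a mean estimate clearly above the true value of 0.2, while trials that run to the final look average below it, matching the balancing argument made for meta-analyses.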
The meaning of p-values in sequential analyses also changes: because more than one analysis is performed, the typical definition of a p-value as the probability of data "at least as extreme" as those observed needs to be redefined. One solution is to order the p-values of a series of sequential tests based on the time of stopping and how high the test statistic was at a given look, which is known as stagewise ordering, first proposed by Armitage.
See also
* Optimal stopping
* Sequential estimation
* Sequential probability ratio test
* CUSUM
References
* Bartroff, J., Lai, T.L., and Shih, M.-C. (2013). Sequential Experimentation in Clinical Trials: Design and Analysis. Springer.
* Bakeman, R. and Gottman, J.M. (1997). Observing Interaction: An Introduction to Sequential Analysis. Cambridge: Cambridge University Press.
* Jennison, C. and Turnbull, B.W. (2000). Group Sequential Methods With Applications to Clinical Trials. Chapman & Hall/CRC.
* Whitehead, J. (1997). The Design and Analysis of Sequential Clinical Trials, 2nd Edition. John Wiley & Sons.