Up-and-down designs (UDDs) are a family of statistical experiment designs used in dose-finding experiments in science, engineering, and medical research. Dose-finding experiments have ''binary responses'': each individual outcome can be described as one of two possible values, such as success vs. failure or toxic vs. non-toxic. Mathematically, the binary responses are coded as 1 and 0. The goal of dose-finding experiments is to estimate the strength of treatment (i.e., the "dose") that would trigger the "1" response a pre-specified proportion of the time. This dose can be envisioned as a percentile of the distribution of response thresholds. An example where dose-finding is used is an experiment to estimate the LD₅₀ of some toxic chemical with respect to mice.

[Figure: UpAndDownFig1]

Dose-finding designs are sequential and response-adaptive: the dose at a given point in the experiment depends upon previous outcomes, rather than being fixed ''a priori''. Dose-finding designs are generally more efficient for this task than fixed designs, but their properties are harder to analyze, and some require specialized design software. UDDs use a discrete set of doses rather than varying the dose continuously. They are relatively simple to implement, and are also among the best understood dose-finding designs. Despite this simplicity, UDDs generate random walks with intricate properties. The original UDD aimed to find the median threshold by increasing the dose one level after a "0" response, and decreasing it one level after a "1" response; hence the name "up-and-down". Other UDDs break this symmetry in order to estimate percentiles other than the median, or are able to treat groups of subjects rather than one at a time.

UDDs were developed in the 1940s by several research groups independently. The 1950s and 1960s saw rapid diversification, with UDDs targeting percentiles other than the median and expanding into numerous applied fields. The 1970s to early 1990s saw little UDD methods research, even as the design continued to be used extensively. A revival of UDD research since the 1990s has provided deeper understanding of UDDs and their properties, and new and better estimation methods.

UDDs are still used extensively in the two applications for which they were originally developed: psychophysics, where they are used to estimate sensory thresholds and are often known as fixed forced-choice staircase procedures, and explosive sensitivity testing, where the median-targeting UDD is often known as the Bruceton test. UDDs are also very popular in toxicity and anesthesiology research. They are also considered a viable choice for Phase I clinical trials.

Mathematical description

Definition

Let $n$ be the sample size of a UDD experiment, and assume for now that subjects are treated one at a time. Then the doses these subjects receive, denoted as random variables $X_1,\ldots,X_n$, are chosen from a discrete, finite set of $M$ increasing ''dose levels'' $\mathcal{X}=\left\{d_1,\ldots,d_M:\ d_1<\cdots<d_M\right\}.$ Furthermore, if $X_i=d_m$, then $X_{i+1}\in\{d_{m-1},d_m,d_{m+1}\},$ according to simple constant rules based on recent responses. That is, the next subject must be treated one level up, one level down, or at the same level as the current subject. The responses themselves are denoted $Y_1,\ldots,Y_n \in\left\{0,1\right\};$ hereafter the "1" responses are positive and "0" negative. The repeated application of the same rules (known as ''dose-transition rules'') over a finite set of dose levels turns $X_1,\ldots,X_n$ into a random walk over $\mathcal{X}$. Different dose-transition rules produce different UDD "flavors", such as the three shown in the figure above.

Despite the experiment using only a discrete set of dose levels, the dose-magnitude variable itself, $x$, is assumed to be continuous, and the probability of positive response is assumed to increase continuously with increasing $x$. The goal of dose-finding experiments is to estimate the dose $x$ (on a continuous scale) that would trigger positive responses at a pre-specified target rate $\Gamma=P\left\{Y=1\mid X=x\right\},\ \ \Gamma\in(0,1)$; this dose is often known as the "target dose". This problem can also be expressed as estimation of the quantile $F^{-1}(\Gamma)$ of a cumulative distribution function describing the dose-toxicity curve $F(x)$. The density function $f(x)$ associated with $F(x)$ is interpretable as the distribution of ''response thresholds'' of the population under study.
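As a small illustration of the target dose as a quantile, the sketch below assumes a logistic dose-response curve. The logistic form, the function names, and the parameter values are assumptions made only for this example; the article does not prescribe any particular shape for $F(x)$.

```python
import math


def logistic_cdf(x, mu, scale):
    """Assumed dose-response curve F(x): logistic with location mu and scale."""
    return 1.0 / (1.0 + math.exp(-(x - mu) / scale))


def logistic_quantile(gamma, mu, scale):
    """Target dose F^{-1}(gamma) for the assumed logistic curve."""
    if not 0.0 < gamma < 1.0:
        raise ValueError("gamma must lie strictly between 0 and 1")
    return mu + scale * math.log(gamma / (1.0 - gamma))


# Hypothetical curve: median threshold at dose 10, scale 2; target rate 0.3.
target_dose = logistic_quantile(0.3, mu=10.0, scale=2.0)
```

Evaluating `logistic_cdf` at `target_dose` recovers the target rate, which is the defining property of the quantile.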

Transition probability matrix

Given that a subject receives dose $d_m$, denote the probability that the next subject receives dose $d_{m-1}$, $d_m$, or $d_{m+1}$, as $p_{m,m-1}$, $p_{mm}$ or $p_{m,m+1}$, respectively. These ''transition probabilities'' obey the constraints $p_{m,m-1}+p_{mm}+p_{m,m+1}=1$ and the boundary conditions $p_{1,0}=p_{M,M+1}=0$. Each specific set of UDD rules enables the symbolic calculation of these probabilities, usually as a function of $F(x)$. We assume that transition probabilities are fixed in time, depending only upon the current allocation and its outcome, i.e., upon $\left(X_i,Y_i\right)$ and through them upon $F(x)$ (and possibly on a set of fixed parameters). The probabilities are then best represented via a tri-diagonal transition probability matrix (TPM) $\mathbf{P}$:

$$\mathbf{P}=\left(
\begin{array}{cccccc}
  p_{11} & p_{12} & 0 & \cdots & \cdots & 0 \\
  p_{21} & p_{22} & p_{23} & 0 & \ddots & \vdots \\
  0 & \ddots & \ddots & \ddots & \ddots & \vdots \\
  \vdots & \ddots & \ddots & \ddots & \ddots & 0 \\
  \vdots & \ddots & 0 & p_{M-1,M-2} & p_{M-1,M-1} & p_{M-1,M} \\
  0 & \cdots & \cdots & 0 & p_{M,M-1} & p_{MM}\\
\end{array}
\right).$$
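A minimal sketch of how such a matrix might be assembled from per-dose "up" and "down" probabilities is given below. The helper name `build_tpm`, the example response probabilities, and the choice to fold a barred boundary move into the "stay" entry are assumptions made for illustration.

```python
def build_tpm(up, down):
    """Assemble a tri-diagonal TPM from per-dose 'up' and 'down' probabilities.

    up[m] and down[m] are the probabilities of moving one level up or down
    from dose index m (0-based). Moves off either end of the dose set are
    barred, so their probability is added to the 'stay' entry.
    """
    M = len(up)
    P = [[0.0] * M for _ in range(M)]
    for m in range(M):
        p_up = up[m] if m < M - 1 else 0.0     # boundary: p_{M,M+1} = 0
        p_down = down[m] if m > 0 else 0.0     # boundary: p_{1,0} = 0
        if m < M - 1:
            P[m][m + 1] = p_up
        if m > 0:
            P[m][m - 1] = p_down
        P[m][m] = 1.0 - p_up - p_down          # each row must sum to 1
    return P


# Hypothetical positive-response probabilities F(d_m) at M = 5 doses.
F = [0.1, 0.3, 0.5, 0.7, 0.9]
# Example rule: up after a "0" response, down after a "1" response.
P = build_tpm(up=[1 - f for f in F], down=F)
```

Every row of the result sums to one, and all entries more than one step away from the diagonal are zero.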

Balance point

Usually, UDD dose-transition rules bring the dose down (or at least bar it from escalating) after positive responses, and vice versa. Therefore, UDD random walks have a central tendency: dose assignments tend to meander back and forth around some dose $x^*$ that can be calculated from the transition rules, when those are expressed as a function of $F(x)$. This dose has often been confused with the experiment's formal target $F^{-1}(\Gamma)$, and the two are often identical - but they do not have to be. The target is the dose that the experiment is tasked with estimating, while $x^*$, known as the "balance point", is approximately the dose around which the UDD's random walk revolves.

Stationary distribution of dose allocations

Since UDD random walks are regular Markov chains, they generate a stationary distribution of dose allocations, $\pi$, once the effect of the manually-chosen starting dose wears off. This means that long-term visit frequencies to the various doses will approximate a steady state described by $\pi$. According to Markov chain theory, the starting-dose effect wears off rather quickly, at a geometric rate. Numerical studies suggest that it would typically take between $2/M$ and $4/M$ subjects for the effect to wear off nearly completely. $\pi$ is also the asymptotic distribution of cumulative dose allocations.

UDDs' central tendencies ensure that long-term, the most frequently visited dose (i.e., the mode of $\pi$) will be one of the two doses closest to the balance point $x^*$. If $x^*$ is outside the range of allowed doses, then the mode will be on the boundary dose closest to it. Under the original median-finding UDD, the mode will be at the closest dose to $x^*$ in any case. Away from the mode, asymptotic visit frequencies decrease sharply, at a faster-than-geometric rate. Even though a UDD experiment is still a random walk, long excursions away from the region of interest are very unlikely.
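For a tri-diagonal TPM, $\pi$ can be computed directly. The sketch below relies on the standard detailed-balance property of birth-death chains, $\pi_m p_{m,m+1}=\pi_{m+1}p_{m+1,m}$, which is not stated in the text above. The function name and the response probabilities are hypothetical, and the median-finding rule (up after "0", down after "1") is used as the example.

```python
def stationary_from_tridiagonal(P):
    """Stationary distribution of a tri-diagonal (birth-death) chain.

    Uses detailed balance: pi[m] * P[m][m+1] == pi[m+1] * P[m+1][m].
    Assumes all up/down probabilities between adjacent levels are positive.
    """
    M = len(P)
    weights = [1.0]
    for m in range(M - 1):
        weights.append(weights[-1] * P[m][m + 1] / P[m + 1][m])
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical F(d_m) values; rule: up with probability 1-F, down with F.
F = [0.1, 0.3, 0.5, 0.7, 0.9]
M = len(F)
P = [[0.0] * M for _ in range(M)]
for m in range(M):
    if m < M - 1:
        P[m][m + 1] = 1 - F[m]
    if m > 0:
        P[m][m - 1] = F[m]
    P[m][m] = 1.0 - sum(P[m])   # mass of a barred boundary move stays put

pi = stationary_from_tridiagonal(P)
mode_index = max(range(M), key=lambda m: pi[m])
```

In this example the mode falls on the dose whose response probability is 0.5, and the visit frequencies fall off on either side of it.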

Common UDDs

Original ("simple" or "classical") UDD

The original "simple" or "classical" UDD moves the dose up one level upon a negative response, and vice versa. Therefore, the transition probabilities are

$$\begin{array}{rl}
p_{m,m+1}&=P\{Y_i=0\mid X_i=d_m\}=1-F(d_m);\\
p_{m,m-1}&=P\{Y_i=1\mid X_i=d_m\}=F(d_m).
\end{array}$$

We use the original UDD as an example for calculating the balance point $x^*$. The design's 'up', 'down' functions are $p(x)=1-F(x),\ q(x)=F(x).$ We equate them to find $F^*=F(x^*)$, the response rate at the balance point:

$$1-F^*=F^*\ \longrightarrow\ F^*=0.5.$$

The "classical" UDD is designed to find the median threshold. This is a case where $F^*=\Gamma.$

The "classical" UDD can be seen as a special case of each of the more versatile designs described below.
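The classical rule is simple enough to simulate in a few lines. The following is a sketch only: the function name, the response probabilities, the seed, and the handling of the boundaries (repeating the boundary dose when a move is barred) are assumptions made for illustration.

```python
import random


def simulate_classical_udd(F, n, start, rng):
    """Simulate n subjects under the classical UDD.

    F[m] is the probability of a positive response at dose index m (0-based).
    Returns the list of dose indices given and the list of 0/1 responses.
    """
    M = len(F)
    m = start
    doses, responses = [], []
    for _ in range(n):
        y = 1 if rng.random() < F[m] else 0
        doses.append(m)
        responses.append(y)
        if y == 0:
            m = min(m + 1, M - 1)   # up after a negative response
        else:
            m = max(m - 1, 0)       # down after a positive response
    return doses, responses


rng = random.Random(2024)           # fixed seed so the run is repeatable
doses, responses = simulate_classical_udd(
    [0.1, 0.3, 0.5, 0.7, 0.9], n=40, start=0, rng=rng
)
```

Starting at the lowest dose, as here, mimics a cautious experiment; the resulting trajectory climbs toward the median and then oscillates around it.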

Durham and Flournoy's biased coin design

This UDD shifts the balance point by adding the option of treating the next subject at the same dose rather than moving only up or down. Whether to stay is determined by a random toss of a metaphoric "coin" with probability $b=P\{\textrm{heads}\}.$ This biased-coin design (BCD) has two "flavors", one for $F^*>0.5$ and one for $F^*<0.5$; the rules for the latter are shown below:

$$X_{i+1}=\begin{cases}
d_{m+1} & \textrm{if}\ \ Y_i=0\ \ \&\ \ \textrm{'heads'};\\
d_{m-1} & \textrm{if}\ \ Y_i=1;\\
d_m & \textrm{if}\ \ Y_i=0\ \ \&\ \ \textrm{'tails'}.
\end{cases}$$

The heads probability $b$ can take any value in $[0,1]$. The balance point is

$$\begin{array}{rcl}
b\left(1-F^*\right) &=& F^*\\
F^* &=& \frac{b}{1+b}\in[0,0.5].
\end{array}$$

The BCD balance point can be made identical to the target $F^{-1}(\Gamma)$ by setting the heads probability to $b=\Gamma/(1-\Gamma)$. For example, for $\Gamma=0.3$ set $b=3/7$. Setting $b=1$ makes this design identical to the classical UDD, and inverting the rules by imposing the coin toss upon positive rather than negative outcomes produces above-median balance points. Versions with two coins, one for each outcome, have also been published, but they do not seem to offer an advantage over the simpler single-coin BCD.
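The relationship between the target rate and the coin can be checked with exact arithmetic. The sketch below covers the below-median flavor only; the function names, the seed, and the boundary handling are illustrative assumptions.

```python
import random
from fractions import Fraction


def bcd_heads_probability(gamma):
    """Heads probability b = gamma / (1 - gamma) for a below-median target."""
    if not 0 < gamma <= Fraction(1, 2):
        raise ValueError("this BCD flavor needs 0 < gamma <= 0.5")
    return gamma / (1 - gamma)


def bcd_balance_rate(b):
    """Response rate at the balance point, F* = b / (1 + b)."""
    return b / (1 + b)


def bcd_next_level(m, y, b, M, rng):
    """Next 0-based dose index after response y at index m, among M levels."""
    if y == 1:
        return max(m - 1, 0)             # down after a positive response
    if rng.random() < float(b):          # 'heads': escalate
        return min(m + 1, M - 1)
    return m                             # 'tails': repeat the same dose


b = bcd_heads_probability(Fraction(3, 10))   # exact arithmetic for Gamma = 0.3
rng = random.Random(7)
next_level = bcd_next_level(m=2, y=0, b=b, M=5, rng=rng)
```

Using `Fraction` keeps $b=3/7$ exact, so feeding it back into `bcd_balance_rate` returns the target rate without rounding error.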

Group (cohort) UDDs

Some dose-finding experiments, such as phase I trials, require a waiting period of weeks before determining each individual outcome. It may be preferable, then, to be able to treat several subjects at once or in rapid succession. With group UDDs, the transition rules apply to cohorts of fixed size $s$ rather than to individuals. $X_i$ becomes the dose given to cohort $i$, and $Y_i$ is the number of positive responses in the $i$-th cohort, rather than a binary outcome. Given that the $i$-th cohort is treated at $X_i=d_m$ on the interior of $\mathcal{X}$, the $(i+1)$-th cohort is assigned to

$$X_{i+1}=\begin{cases}
d_{m+1} & \textrm{if}\ \ Y_i\le l;\\
d_{m-1} & \textrm{if}\ \ Y_i\ge u;\\
d_m & \textrm{if}\ \ l<Y_i<u.
\end{cases}$$

$Y_i$ follows a binomial distribution conditional on $X_i$, with parameters $s$ and $F(X_i)$. The up and down probabilities are the binomial distribution's tails, and the stay probability its center (it is zero if $u=l+1$). A specific choice of parameters can be abbreviated as $GUD_{(s,l,u)}$. Nominally, group UDDs generate $s$-order random walks, since the $s$ most recent observations are needed to determine the next allocation. However, with cohorts viewed as single mathematical entities, these designs generate a first-order random walk having a tri-diagonal TPM as above. Some relevant group UDD subfamilies:

* Symmetric designs with $l+u=s$ (e.g., $GUD_{(2,0,2)}$) target the median.
* The family $GUD_{(s,0,1)}$, encountered in toxicity studies, allows escalation only with zero positive responses, and de-escalates upon any positive response. The escalation probability at $x$ is $\left(1-F(x)\right)^s$, and since this design does not allow for remaining at the same dose, at the balance point it will be exactly $1/2$. Therefore, $F^*=1-\left(\frac{1}{2}\right)^{1/s}$. Choosing $s=2,3,4$ yields $F^*\approx 0.293, 0.206$ and $0.159$, respectively. The mirror-image family $GUD_{(s,s-1,s)}$ has its balance points at one minus these probabilities.

For general group UDDs, the balance point can be calculated only numerically, by finding the dose $x^*$ with toxicity rate $F^*$ such that

$$\sum_{r=u}^{s}\binom{s}{r}\left(F^*\right)^r\left(1-F^*\right)^{s-r}=\sum_{t=0}^{l}\binom{s}{t}\left(F^*\right)^t\left(1-F^*\right)^{s-t}.$$

Any numerical root-finding algorithm, e.g., Newton–Raphson, can be used to solve for $F^*$.
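The balance equation can be solved in a few lines. The sketch below uses bisection rather than Newton–Raphson, since the difference between the "up" and "down" probabilities decreases monotonically from $+1$ to $-1$ as $F^*$ goes from 0 to 1. The function names and the tolerance are illustrative choices.

```python
from math import comb


def binom_pmf(r, s, p):
    """Probability of exactly r positive responses in a cohort of size s."""
    return comb(s, r) * p**r * (1 - p) ** (s - r)


def gud_balance_rate(s, l, u, tol=1e-12):
    """Response rate F* at which P(up) equals P(down) for GUD_(s,l,u)."""
    if not 0 <= l < u <= s:
        raise ValueError("need 0 <= l < u <= s")

    def up_minus_down(p):
        p_up = sum(binom_pmf(t, s, p) for t in range(0, l + 1))
        p_down = sum(binom_pmf(r, s, p) for r in range(u, s + 1))
        return p_up - p_down

    lo, hi = 0.0, 1.0               # the difference is +1 at 0 and -1 at 1
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if up_minus_down(mid) > 0:
            lo = mid                # still escalating more often: move right
        else:
            hi = mid
    return (lo + hi) / 2


rate_gud_201 = gud_balance_rate(2, 0, 1)
rate_gud_202 = gud_balance_rate(2, 0, 2)
```

For the $GUD_{(s,0,1)}$ family, the numerical result can be compared against the closed form $1-(1/2)^{1/s}$ given in the list above.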

$k$-in-a-row (or "transformed" or "geometric") UDD

This is the most commonly used non-median UDD. It was introduced by Wetherill in 1963, and spread by him and colleagues shortly thereafter to psychophysics, where it remains one of the standard methods to find sensory thresholds. Wetherill called it the "transformed" UDD; Misrak Gezmu, who was the first to analyze its random-walk properties, called it the "geometric" UDD in the 1990s; and in the 2000s the more straightforward name "$k$-in-a-row" UDD was adopted. The design's rules are deceptively simple:

$$X_{i+1}=\begin{cases}
d_{m+1} & \textrm{if}\ \ Y_{i-k+1}=\cdots=Y_i=0,\ \ \textrm{all observed at}\ \ d_m;\\
d_{m-1} & \textrm{if}\ \ Y_i=1;\\
d_m & \textrm{otherwise}.
\end{cases}$$

Every dose escalation requires $k$ non-toxicities observed on consecutive data points, all at the current dose, while de-escalation only requires a single toxicity. It closely resembles $GUD_{(s,0,1)}$ described above, and indeed shares the same balance point. The difference is that $k$-in-a-row can bail out of a dose level upon the first toxicity, whereas its group UDD sibling might treat the entire cohort at once, and therefore might see more than one toxicity before descending.

The method used in sensory studies is actually the mirror-image of the one defined above, with $k$ successive responses required for a de-escalation and only one non-response for escalation, yielding $F^*\approx 0.707,0.794,0.841,\ldots$ for $k=2,3,4,\ldots$.

$k$-in-a-row generates a $k$-th order random walk because knowledge of the last $k$ responses might be needed. It can be represented as a first-order chain with $Mk$ states, or as a Markov chain with $M$ levels, each having $k$ ''internal states'' labeled $0$ to $k-1$. The internal state serves as a counter of the number of immediately recent consecutive non-toxicities observed at the current dose. This description is closer to the physical dose-allocation process, because subjects at different internal states of the level $m$ are all assigned the same dose $d_m$. Either way, the TPM is $Mk\times Mk$ (or more precisely, $\left[(M-1)k+1\right]\times\left[(M-1)k+1\right]$, because the internal counter is meaningless at the highest dose) - and it is not tridiagonal.

Here is the expanded $k$-in-a-row TPM with $k=2$ and $M=5$, using the abbreviation $F_m\equiv F\left(d_m\right).$ Each level's internal states are adjacent to each other.

$$\begin{bmatrix}
    F_1 & 1-F_1 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\
    F_1 & 0 & 1-F_1 & 0 & 0 & 0 & 0 & 0 & 0\\
    F_2 & 0 & 0 & 1-F_2 & 0 & 0 & 0 & 0 & 0\\
    F_2 & 0 & 0 & 0 & 1-F_2 & 0 & 0 & 0 & 0\\
    0 & 0 & F_3 & 0 & 0 & 1-F_3 & 0 & 0 & 0\\
    0 & 0 & F_3 & 0 & 0 & 0 & 1-F_3 & 0 & 0\\
    0 & 0 & 0 & 0 & F_4 & 0 & 0 & 1-F_4 & 0\\
    0 & 0 & 0 & 0 & F_4 & 0 & 0 & 0 & 1-F_4\\
    0 & 0 & 0 & 0 & 0 & 0 & F_5 & 0 & 1-F_5\\
\end{bmatrix}$$

$k$-in-a-row is often considered for clinical trials targeting a low-toxicity dose. In this case, the balance point and the target are not identical; rather, $k$ is chosen to aim close to the target rate, e.g., $k=2$ for studies targeting the 30th percentile, and $k=3$ for studies targeting the 20th percentile.
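The expanded matrix can be generated for any $k$ and $M$. The sketch below orders the states as in the matrix above (each level's internal states adjacent, a single state at the top level); the function name, the state-indexing helper, and the response probabilities are hypothetical.

```python
def k_in_a_row_tpm(F, k):
    """Expanded TPM for the k-in-a-row UDD.

    F[m] is the toxicity probability at dose index m (0-based). States are
    ordered level by level, with k counter states per level except the
    highest level, which has a single state.
    """
    M = len(F)
    if M < 2 or k < 1:
        raise ValueError("need at least two dose levels and k >= 1")
    size = (M - 1) * k + 1

    def idx(m, c):
        # Row/column index of level m with counter c; top level has one state.
        return m * k + c if m < M - 1 else (M - 1) * k

    P = [[0.0] * size for _ in range(size)]
    for m in range(M):
        counters = range(k) if m < M - 1 else range(1)
        for c in counters:
            row = idx(m, c)
            # A toxicity moves one level down (or stays at the lowest level)
            # and resets the counter.
            P[row][idx(max(m - 1, 0), 0)] += F[m]
            if m == M - 1:
                P[row][row] += 1 - F[m]               # cannot escalate further
            elif c + 1 == k:
                P[row][idx(m + 1, 0)] += 1 - F[m]     # k-th non-toxicity: go up
            else:
                P[row][idx(m, c + 1)] += 1 - F[m]     # advance the counter
    return P


# Hypothetical toxicity probabilities at M = 5 doses, with k = 2.
F = [0.05, 0.1, 0.2, 0.3, 0.45]
P = k_in_a_row_tpm(F, k=2)
```

With $k=1$ the same function reduces to the tri-diagonal matrix of the classical UDD, which is a convenient sanity check.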

Estimating the target dose

Unlike other design approaches, UDDs do not have a specific estimation method "bundled in" with the design as a default choice. Historically, the more common choice has been some weighted average of the doses administered, usually excluding the first few doses to mitigate the starting-point bias. This approach antedates deeper understanding of UDDs' Markov properties, but its success in numerical evaluations relies upon the eventual sampling from $\pi$, since the latter is centered roughly around $x^*.$

The single most popular among these ''averaging estimators'' was introduced by Wetherill et al. in 1966, and only includes ''reversal points'' (points where the outcome switches from 0 to 1 or vice versa) in the average. In recent years, the limitations of averaging estimators have come to light, in particular the many sources of bias that are very difficult to mitigate. Reversal estimators suffer from both multiple biases (although there is some inadvertent cancelling out of biases), and increased variance due to using a subsample of doses. However, the knowledge about averaging-estimator limitations has yet to disseminate outside the methodological literature and affect actual practice.

By contrast, ''regression estimators'' attempt to approximate the curve $y=F(x)$ describing the dose-response relationship, in particular around the target percentile. The raw data for the regression are the doses $d_m$ on the horizontal axis, and the observed toxicity frequencies,

$$\hat{F}_m=\frac{\sum_{i=1}^n Y_i I\left[X_i=d_m\right]}{\sum_{i=1}^n I\left[X_i=d_m\right]},\ m=1,\ldots,M,$$

on the vertical axis. The target estimate is the abscissa of the point where the fitted curve crosses $y=\Gamma.$

Probit regression has been used for many decades to estimate UDD targets, although far less commonly than the reversal-averaging estimator. In 2002, Stylianou and Flournoy introduced an interpolated version of isotonic regression (IR) to estimate UDD targets and other dose-response data. More recently, a modification called "centered isotonic regression" (CIR) was developed by Oron and Flournoy, promising substantially better estimation performance than ordinary isotonic regression in most cases, and also offering the first viable interval estimator for isotonic regression in general. Isotonic regression estimators appear to be the most compatible with UDDs, because both approaches are nonparametric and relatively robust. The publicly available R package "cir" implements both CIR and IR for dose-finding and other applications.
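The two estimator families can be illustrated on a short, made-up classical-UDD trajectory. The sketch below is not the "cir" package and does not implement CIR. It shows a plain reversal average, following the description of reversal points above, and ordinary isotonic regression (pool-adjacent-violators, weighted by the number of subjects per dose) with linear interpolation. All function names and the data are hypothetical.

```python
def reversal_average(doses, responses):
    """Mean dose over trials whose response differs from the previous one."""
    rev = [doses[i] for i in range(1, len(doses))
           if responses[i] != responses[i - 1]]
    if not rev:
        raise ValueError("no reversals observed")
    return sum(rev) / len(rev)


def observed_frequencies(doses, responses):
    """Sorted distinct doses, observed frequency per dose, subjects per dose."""
    counts, positives = {}, {}
    for x, y in zip(doses, responses):
        counts[x] = counts.get(x, 0) + 1
        positives[x] = positives.get(x, 0) + y
    levels = sorted(counts)
    return (levels,
            [positives[x] / counts[x] for x in levels],
            [counts[x] for x in levels])


def pava(values, weights):
    """Weighted pool-adjacent-violators fit, non-decreasing in index order."""
    blocks = []  # each block: [weighted mean, total weight, number of points]
    for v, w in zip(values, weights):
        blocks.append([v, w, 1])
        # Merge backwards while the monotone order is violated.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            v2, w2, n2 = blocks.pop()
            v1, w1, n1 = blocks.pop()
            blocks.append([(v1 * w1 + v2 * w2) / (w1 + w2), w1 + w2, n1 + n2])
    fitted = []
    for v, _, n in blocks:
        fitted.extend([v] * n)
    return fitted


def interpolate_target(levels, fitted, gamma):
    """Dose where the piecewise-linear fit first reaches gamma, else None."""
    for j in range(len(levels) - 1):
        lo, hi = fitted[j], fitted[j + 1]
        if lo <= gamma <= hi:
            if hi == lo:
                return levels[j]
            step = levels[j + 1] - levels[j]
            return levels[j] + (gamma - lo) / (hi - lo) * step
    return None  # the fit never crosses gamma; no extrapolation is attempted


# Made-up trajectory consistent with the classical up-and-down rule.
doses = [1, 2, 3, 2, 3, 4, 3, 2, 3, 2]
responses = [0, 0, 1, 0, 0, 1, 1, 0, 1, 0]

levels, f_hat, n_per_dose = observed_frequencies(doses, responses)
fitted = pava(f_hat, n_per_dose)
ir_estimate = interpolate_target(levels, fitted, gamma=0.5)
avg_estimate = reversal_average(doses, responses)
```

Returning `None` when the fitted curve never reaches $\Gamma$ is a deliberate choice in this sketch: extrapolating beyond the observed doses would require assumptions the nonparametric fit does not supply.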
