Accelerated failure time model

In the statistical area of survival analysis, an accelerated failure time model (AFT model) is a parametric model that provides an alternative to the commonly used proportional hazards models. Whereas a proportional hazards model assumes that the effect of a covariate is to multiply the hazard by some constant, an AFT model assumes that the effect of a covariate is to accelerate or decelerate the life course of a disease by some constant. This is especially appealing in a technical context where the 'disease' is a result of some mechanical process with a known sequence of intermediary stages.


Model specification

In full generality, the accelerated failure time model can be specified as

:: \lambda(t, \theta)=\theta\lambda_0(\theta t)

where \theta denotes the joint effect of covariates, typically \theta=\exp(-[\beta_1X_1 + \cdots + \beta_pX_p]). (Specifying the regression coefficients with a negative sign implies that high values of the covariates ''increase'' the survival time, but this is merely a sign convention; without a negative sign, they increase the hazard.)

This is satisfied if the probability density function of the event is taken to be f(t, \theta)=\theta f_0(\theta t); it then follows for the survival function that S(t, \theta)=S_0(\theta t). From this it is easy to see that the moderated life time T is distributed such that T\theta and the unmoderated life time T_0 have the same distribution. Consequently, \log(T) can be written as

:: \log(T)=-\log(\theta)+\log(T\theta):=-\log(\theta)+\epsilon

where the last term is distributed as \log(T_0), i.e., independently of \theta. This reduces the accelerated failure time model to regression analysis (typically a linear model) where -\log(\theta) represents the fixed effects, and \epsilon represents the noise. Different distributions of \epsilon imply different distributions of T_0, i.e., different baseline distributions of the survival time.

Typically, in survival-analytic contexts, many of the observations are censored: we only know that T_i>t_i, not T_i=t_i. The former case represents survival beyond the observed time t_i (a right-censored observation), while the latter case represents an event, such as death, observed during the follow-up. These right-censored observations can pose technical challenges for estimating the model if the distribution of T_0 is unusual.

The interpretation of \theta in accelerated failure time models is straightforward: \theta=2 means that everything in the relevant life history of an individual happens twice as fast. For example, if the model concerns the development of a tumor, it means that all of the pre-stages progress twice as fast as for the unexposed individual, implying that the expected time until a clinical disease is 0.5 of the baseline time. However, this does not mean that the hazard function \lambda(t, \theta) is always twice as high; that would be the proportional hazards model.
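The reduction to a linear model for \log(T) can be illustrated with a small simulation. The following is only a sketch under stated assumptions: a single binary covariate, a Weibull baseline lifetime T_0 with an arbitrarily chosen shape, no censoring, and ordinary least squares in place of a proper survival likelihood. The function names and all numerical settings are illustrative and not part of the model itself.

```python
import math
import random
import statistics

def simulate_aft(n, beta, shape=1.5, seed=0):
    """Draw (x, t) pairs from an AFT model with one binary covariate.

    Assumed setup: theta = exp(-beta * x) and T = T_0 / theta,
    so that T * theta has the same distribution as T_0.
    """
    rng = random.Random(seed)
    data = []
    for i in range(n):
        x = i % 2                            # alternate unexposed (0) / exposed (1)
        theta = math.exp(-beta * x)          # acceleration factor
        t0 = rng.weibullvariate(1.0, shape)  # baseline lifetime T_0 (scale 1)
        data.append((x, t0 / theta))
    return data

def ols_slope_intercept(xs, ys):
    """Closed-form simple linear regression of ys on xs."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

beta = -math.log(2.0)  # gives theta = 2 for exposed individuals
data = simulate_aft(20000, beta)

xs = [x for x, _ in data]
log_ts = [math.log(t) for _, t in data]

# log(T) = -log(theta) + epsilon = beta * x + epsilon, so the slope estimates beta.
# The intercept absorbs E[epsilon] = E[log(T_0)], which is not zero in general.
slope, intercept = ols_slope_intercept(xs, log_ts)

median_unexposed = statistics.median(t for x, t in data if x == 0)
median_exposed = statistics.median(t for x, t in data if x == 1)
median_ratio = median_exposed / median_unexposed  # should be near 1 / theta

print(f"true beta = {beta:.3f}, estimated slope = {slope:.3f}")
print(f"exposed / unexposed median time = {median_ratio:.3f}")
```

With \theta=2 for the exposed group, the estimated slope should be close to -\log(2) and the ratio of median survival times close to 0.5, matching the "twice as fast" reading of \theta. With censored data, least squares on \log(T) is no longer appropriate and a likelihood that accounts for censoring is needed.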


Statistical issues

Among proportional hazards models, Cox's semi-parametric model is more widely used than parametric alternatives. AFT models, by contrast, are predominantly fully parametric, i.e. a probability distribution is specified for \log(T_0). (Buckley and James proposed a semi-parametric AFT but its use is relatively uncommon in applied research; in a 1992 paper, Wei pointed out that the Buckley–James model has no theoretical justification and lacks robustness, and reviewed alternatives.) Having to specify the distribution can be a problem if a degree of realistic detail is required for modelling the distribution of a baseline lifetime. Hence, technical developments in this direction would be highly desirable.

Unlike proportional hazards models, the regression parameter estimates from AFT models are robust to omitted covariates. They are also less affected by the choice of probability distribution.

The results of AFT models are easily interpreted. For example, the results of a clinical trial with mortality as the endpoint could be interpreted as a certain percentage increase in future life expectancy on the new treatment compared to the control. So a patient could be informed that they would be expected to live (say) 15% longer if they took the new treatment. Hazard ratios can prove harder to explain in layman's terms.
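The "15% longer" reading can be made concrete with a short calculation. This is a minimal sketch assuming the sign convention \theta=\exp(-\beta x) with a 0/1 treatment indicator x, so that survival times on treatment are those of the control multiplied by 1/\theta=\exp(\beta). The coefficient value and the 10-year control median below are hypothetical numbers chosen for illustration, not results from any trial.

```python
import math

def time_ratio(beta, delta_x=1.0):
    """Multiplicative change in survival time when the covariate rises by delta_x,
    assuming theta = exp(-beta * x), i.e. times scale by 1 / theta = exp(beta * delta_x)."""
    return math.exp(beta * delta_x)

def percent_change(beta, delta_x=1.0):
    """The same effect expressed as a percentage change in survival time."""
    return 100.0 * (time_ratio(beta, delta_x) - 1.0)

beta_treatment = math.log(1.15)  # hypothetical fitted coefficient
control_median_years = 10.0      # hypothetical control-arm median survival

# Under an AFT model every quantile of the survival time scales by the same factor.
treated_median_years = control_median_years * time_ratio(beta_treatment)

print(f"time ratio: {time_ratio(beta_treatment):.3f}")
print(f"change in survival time: {percent_change(beta_treatment):.1f}%")
print(f"treated median: {treated_median_years:.2f} years")
```

Because the whole distribution of survival times is rescaled, the same factor applies to the median, the mean and any other quantile, which is what makes the statement easy to communicate.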


Distributions used in AFT models

The log-logistic distribution provides the most commonly used AFT model. Unlike the Weibull distribution, it can exhibit a non-monotonic hazard function which increases at early times and decreases at later times. It is somewhat similar in shape to the log-normal distribution but it has heavier tails. The log-logistic cumulative distribution function has a simple closed form, which becomes important computationally when fitting data with censoring. For the censored observations one needs the survival function, which is the complement of the cumulative distribution function, i.e. one needs to be able to evaluate S(t, \theta)=1-F(t, \theta).

The Weibull distribution (including the exponential distribution as a special case) can be parameterised as either a proportional hazards model or an AFT model, and is the only family of distributions to have this property. The results of fitting a Weibull model can therefore be interpreted in either framework. However, the biological applicability of this model may be limited by the fact that the hazard function is monotonic, i.e. either decreasing or increasing.

Other distributions suitable for AFT models include the log-normal, gamma and inverse Gaussian distributions, although they are less popular than the log-logistic, partly as their cumulative distribution functions do not have a closed form. Finally, the generalized gamma distribution is a three-parameter distribution that includes the Weibull, log-normal and gamma distributions as special cases.
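Several of these properties can be checked numerically. The sketch below assumes the usual log-logistic parameterisation S_0(t)=1/(1+(t/\alpha)^k), with scale \alpha and shape k, and a unit-scale Weibull baseline hazard \lambda_0(t)=kt^{k-1}; neither parameterisation is fixed by the text above, and the function names are illustrative. It evaluates the closed-form survival function, shows the rising-then-falling log-logistic hazard for k>1, builds a right-censored log-likelihood using S(t, \theta)=S_0(\theta t), and confirms that a Weibull AFT model has a hazard ratio that is constant in t.

```python
import math

def loglogistic_survival(t, alpha, k):
    """S_0(t) = 1 / (1 + (t / alpha)^k): closed form, no numerical integration."""
    return 1.0 / (1.0 + (t / alpha) ** k)

def loglogistic_hazard(t, alpha, k):
    """lambda_0(t) = f_0(t) / S_0(t) = (k / t) * z / (1 + z), with z = (t / alpha)^k."""
    z = (t / alpha) ** k
    return (k / t) * z / (1.0 + z)

def censored_loglik(times, events, thetas, alpha, k):
    """Right-censored log-likelihood under S(t, theta) = S_0(theta * t).

    events[i] is 1 if the event was observed at times[i], 0 if censored there.
    Censored observations contribute log S; events contribute log f = log S + log hazard.
    """
    total = 0.0
    for t, d, theta in zip(times, events, thetas):
        u = theta * t  # time measured on the baseline clock
        total += math.log(loglogistic_survival(u, alpha, k))
        if d:
            # lambda(t, theta) = theta * lambda_0(theta * t)
            total += math.log(theta * loglogistic_hazard(u, alpha, k))
    return total

def weibull_aft_hazard(t, k, theta):
    """theta * lambda_0(theta * t) for the baseline hazard lambda_0(t) = k * t^(k-1)."""
    return theta * k * (theta * t) ** (k - 1)

alpha, k = 2.0, 3.0                       # illustrative log-logistic parameters
peak = alpha * (k - 1.0) ** (1.0 / k)     # hazard maximum for k > 1
hazards = [loglogistic_hazard(s * peak, alpha, k) for s in (0.5, 1.0, 2.0)]

# Two toy observations: an event at t = 1 (theta = 1) and a censoring at t = 3 (theta = 2).
loglik = censored_loglik([1.0, 3.0], [1, 0], [1.0, 2.0], alpha, k)

# Weibull: the AFT hazard divided by the baseline hazard is theta^k at every t.
k_w, theta_w = 1.5, 2.0
ratios = [weibull_aft_hazard(t, k_w, theta_w) / weibull_aft_hazard(t, k_w, 1.0)
          for t in (0.5, 1.0, 4.0)]

print("log-logistic hazard before, at and after its peak:", hazards)
print("censored log-likelihood:", loglik)
print("Weibull hazard ratios:", ratios, "expected:", theta_w ** k_w)
```

In this parameterisation the Weibull acceleration factor \theta corresponds to a proportional hazards ratio of \theta^k, which is one way to see why a fitted Weibull model can be read in either framework.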


Further reading

* Martinussen, Torben; Scheike, Thomas (2006), Dynamic Regression Models for Survival Data, Springer
* Bagdonavicius, Vilijandas; Nikulin, Mikhail (2002), Accelerated Life Models: Modeling and Statistical Analysis, Chapman & Hall/CRC