Predictive Methods For Surgery Duration

Predictions of surgery duration (SD) are used to schedule planned/elective surgeries so that the utilization rate of operating theatres is optimized (maximized subject to policy constraints). An example of a constraint is that a pre-specified tolerance for the percentage of postponed surgeries (due to non-available operating room (OR) or recovery room space) not be exceeded. The tight linkage between SD prediction and surgery scheduling is the reason that scientific research on scheduling methods most often also addresses SD predictive methods, and ''vice versa''. Durations of surgeries are known to have large variability. Therefore, SD predictive methods attempt, on the one hand, to reduce variability (via ''stratification'' and ''covariates'', as detailed later), and on the other to employ the best available methods to produce SD predictions. The more accurate the predictions, the better the scheduling of surgeries (in terms of the required OR utilization optimization). An SD predictive method would ideally deliver a predicted SD statistical distribution (specifying the distribution and estimating its parameters). Once the SD distribution is completely specified, various desired types of information can be extracted from it, for example, the most probable duration (mode) or the probability that SD does not exceed a certain threshold value. In less ambitious circumstances, the predictive method would at least predict some of the basic properties of the distribution, like location and scale parameters (mean, median, mode, standard deviation or coefficient of variation, CV). Certain desired percentiles of the distribution may also be the objective of estimation and prediction. Expert estimates, empirical histograms of the distribution (based on historical computer records), and data mining and knowledge discovery techniques often replace the ideal objective of fully specifying the SD theoretical distribution.

Reducing SD variability prior to prediction (as alluded to earlier) is commonly regarded as part and parcel of an SD predictive method. Most probably, SD has, in addition to random variation, also a systematic component; namely, the SD distribution may be affected by various related factors (like medical specialty, patient condition or age, professional experience and size of the medical team, number of surgeries a surgeon has to perform in a shift, or type of anesthetic administered). Accounting for these factors (via stratification or covariates) would diminish SD variability and enhance the accuracy of the predictive method. Incorporating expert estimates (like those of surgeons) in the predictive model may also help diminish the uncertainty of data-based SD prediction. Often, statistically significant covariates (also referred to as factors, predictors or explanatory variables) are first identified (for example, via simple techniques like linear regression and knowledge discovery), and only later are more advanced big-data techniques employed, like artificial intelligence and machine learning, to produce the final prediction. Literature reviews of studies addressing surgery scheduling most often also address related SD predictive methods. The rest of this entry reviews various perspectives associated with the process of producing SD predictions: ''SD statistical distributions'', ''Methods to reduce SD variability (stratification and covariates)'', ''Predictive models and methods'', and ''Surgery as a work-process''. The latter addresses surgery characterization as a work-process (repetitive, semi-repetitive or memoryless) and its effect on SD distributional shape.
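For illustration, once an SD distribution has been fully specified (here a lognormal, a common choice as discussed below), quantities such as the mode, the probability of fitting within a scheduled slot, or a planning percentile follow directly from it. The sketch below is a minimal example assuming Python with SciPy and purely illustrative duration data:

```python
import numpy as np
from scipy import stats

# Hypothetical historical surgery durations, in minutes (illustrative only)
durations = np.array([55, 62, 70, 48, 95, 80, 67, 120, 73, 58, 88, 64])

# Fit a lognormal with the location fixed at zero
shape, loc, scale = stats.lognorm.fit(durations, floc=0)
dist = stats.lognorm(shape, loc=loc, scale=scale)

# Mode of a lognormal: exp(mu - sigma^2), with scale = exp(mu) and shape = sigma
mu, sigma = np.log(scale), shape
mode = np.exp(mu - sigma ** 2)

# Probability that SD does not exceed a 90-minute slot, and a planning percentile
p_within_slot = dist.cdf(90)
q85 = dist.ppf(0.85)

print(f"mode ~ {mode:.1f} min, P(SD <= 90) ~ {p_within_slot:.2f}, 85th pct ~ {q85:.1f} min")
```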


SD Statistical Distributions


Theoretical models

A most straightforward SD predictive method comprises specifying a set of existing statistical distributions and, based on available data and distribution-fitting criteria, selecting the most fitting distribution. There is a large volume of comparative studies that attempt to select the best-fitting models for the SD distribution. The distributions most frequently addressed are the normal, the three-parameter lognormal, the gamma (including the exponential) and the Weibull. Less frequent "trial" distributions (for fitting purposes) are the loglogistic, Burr, generalized gamma and the piecewise-constant hazard model. Attempts to present the SD distribution as a mixture distribution have also been reported (normal-normal, lognormal-lognormal and Weibull-gamma mixtures). Occasionally, predictive methods are developed that are valid for a general SD distribution, or more advanced techniques, like kernel density estimation (KDE), are used instead of the traditional methods (like distribution fitting or regression-oriented methods). There is broad consensus that the three-parameter lognormal best describes most SD distributions. A new ''family'' of SD distributions, which includes the normal, lognormal and exponential as exact special cases, has recently been developed.
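A minimal sketch of such a distribution-fitting comparison, assuming Python with SciPy, illustrative duration data, and the Akaike information criterion (one of several possible fitting criteria) as the selection rule:

```python
import numpy as np
from scipy import stats

# Hypothetical surgery durations, in minutes (illustrative only)
durations = np.array([55, 62, 70, 48, 95, 80, 67, 120, 73, 58, 88, 64, 102, 77])

candidates = {
    "normal": stats.norm,
    "lognormal": stats.lognorm,
    "gamma": stats.gamma,
    "weibull": stats.weibull_min,
}

results = {}
for name, dist in candidates.items():
    params = dist.fit(durations)               # maximum-likelihood fit
    loglik = np.sum(dist.logpdf(durations, *params))
    results[name] = 2 * len(params) - 2 * loglik   # Akaike information criterion

for name, aic in sorted(results.items(), key=lambda kv: kv[1]):
    print(f"{name:10s} AIC = {aic:7.1f}")
print("best-fitting candidate:", min(results, key=results.get))
```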


Using historical records to specify an empirical distribution

As an alternative to specifying a theoretical distribution as a model for SD, one may use historical records to construct a histogram of available data and use the related empirical distribution function (the cumulative plot) to estimate various required percentiles (like the median or the third quartile). Historical records or expert estimates may also be used to specify location and scale parameters, without specifying a model for the SD distribution.
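A minimal sketch of the empirical approach, assuming Python with NumPy and illustrative historical records; no distributional model is specified, and percentiles are read directly from the data:

```python
import numpy as np

# Hypothetical historical durations for one procedure type, in minutes
durations = np.array([55, 62, 70, 48, 95, 80, 67, 120, 73, 58, 88, 64])

# Empirical (cumulative) distribution function: F(t) = share of past cases with SD <= t
x = np.sort(durations)
F = np.arange(1, len(x) + 1) / len(x)

# Required percentiles, straight from the data
median = np.percentile(durations, 50)
third_quartile = np.percentile(durations, 75)
share_within_90 = F[np.searchsorted(x, 90, side="right") - 1]

print(f"median ~ {median:.0f} min, third quartile ~ {third_quartile:.0f} min")
print(f"share of past cases finished within 90 min: {share_within_90:.2f}")
```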


Data mining methods

These methods have recently gained traction as an alternative to specifying, in advance, a theoretical model to describe the SD distribution for all types of surgeries. Examples are detailed below ("Predictive models and methods").


Reducing SD variability (stratification and covariates)

To enhance SD prediction accuracy, two major approaches are pursued to reduce SD data variability: ''stratification'' and ''covariates'' (incorporated in the predictive model). Covariates are also often referred to in the literature as factors, effects, explanatory variables or predictors.


Stratification

The term means that available data are divided (stratified) into subgroups, according to a criterion statistically shown to affect the SD distribution. The predictive method then aims to produce SD predictions for the specified subgroups, whose SD has appreciably reduced variability. Examples of stratification criteria are medical specialty, procedure-code systems, patient-severity condition, or hospital/surgeon/technology (with the resulting models referred to as hospital-specific, surgeon-specific or technology-specific). Examples of implementation are Current Procedural Terminology (CPT) codes and ICD-9-CM Diagnosis and Procedure Codes (International Classification of Diseases, 9th Revision, Clinical Modification).
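A minimal sketch of stratification by procedure code, assuming Python with pandas; the records, field names and code values are illustrative assumptions:

```python
import pandas as pd

# Hypothetical records: one row per completed surgery (illustrative field names)
df = pd.DataFrame({
    "cpt_code": ["47562", "47562", "47562", "44970", "44970", "44970"],
    "duration_min": [95, 110, 88, 52, 61, 47],
})

# Stratify by procedure code: each stratum yields its own, less variable, SD sample
strata = df.groupby("cpt_code")["duration_min"].agg(["count", "mean", "std"])
strata["cv"] = strata["std"] / strata["mean"]   # coefficient of variation per stratum
print(strata)

# A prediction for a planned surgery is then drawn from its own stratum only
print("predicted mean for CPT 47562:", strata.loc["47562", "mean"])
```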


Covariates (factors, effects, explanatory variables, predictors)

This approach to reducing variability incorporates covariates in the prediction model. The same predictive method may then be applied more generally, with covariates assuming different values for different levels of the factors shown to affect the SD distribution (usually by affecting a location parameter, like the mean, and, more rarely, also a scale parameter, like the variance). A most basic method of incorporating covariates into a predictive method is to assume that SD is lognormally distributed. The logged data (taking the log of SD data) then represent a normally distributed population, allowing the use of multiple linear regression to detect statistically significant factors. Other regression methods, which do not require data normality or are robust to its violation (generalized linear models, nonlinear regression), and artificial intelligence methods have also been used.
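A minimal sketch of the log-transform-plus-regression approach, assuming Python with statsmodels; the data set and covariate names (specialty, ASA class, team size) are illustrative assumptions:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data set; field names and values are assumptions for illustration
df = pd.DataFrame({
    "duration_min": [95, 110, 88, 52, 61, 47, 130, 72, 66, 84, 58, 101],
    "specialty":    ["gen", "gen", "gen", "gen", "gen", "gen",
                     "ortho", "ortho", "ortho", "ortho", "ortho", "ortho"],
    "asa_class":    [2, 3, 2, 1, 2, 1, 3, 2, 2, 3, 1, 3],
    "team_size":    [3, 4, 3, 2, 3, 2, 5, 3, 3, 4, 2, 4],
})

# Assume SD is lognormal: model log(SD) with ordinary multiple linear regression
df["log_sd"] = np.log(df["duration_min"])
model = smf.ols("log_sd ~ C(specialty) + asa_class + team_size", data=df).fit()

# Covariates with small p-values are retained as statistically significant predictors
print(model.summary())

# Point prediction on the original scale (naive back-transform; a bias correction
# such as exp(sigma^2 / 2) could be added under the lognormal assumption)
new_case = pd.DataFrame({"specialty": ["ortho"], "asa_class": [2], "team_size": [3]})
print("predicted duration ~", np.exp(model.predict(new_case).iloc[0]), "min")
```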


Predictive models and methods

Following is a ''representative'' (non-exhaustive) list of models and methods employed to produce SD predictions (in no particular order); these, or mixtures thereof, appear throughout the literature (an illustrative sketch of one of them follows the list):
Linear regression (LR); multivariate adaptive regression splines (MARS); random forests (RF); machine learning; data mining (rough sets, neural networks); knowledge discovery in databases (KDD); data warehouse models (used to extract data from various, possibly non-interacting, databases); kernel density estimation (KDE); jackknife; Monte Carlo simulation.
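As a single illustration from this list, a minimal random-forest sketch, assuming Python with scikit-learn and purely illustrative features and durations:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Hypothetical, tiny training set; feature meanings are assumptions for illustration
# columns: [procedure-code group, ASA class, surgeon years of experience]
X = np.array([
    [0, 2, 5], [0, 3, 5], [0, 2, 12], [0, 1, 12],
    [1, 2, 5], [1, 3, 8], [1, 2, 12], [1, 3, 20],
])
y = np.array([95, 110, 88, 80, 52, 61, 47, 58])   # observed durations, in minutes

rf = RandomForestRegressor(n_estimators=200, random_state=0)
rf.fit(X, y)

# Point prediction for a planned case; per-tree predictions give a rough picture
# of the predictive spread as well
case = np.array([[0, 2, 8]])
per_tree = np.array([tree.predict(case)[0] for tree in rf.estimators_])
print(f"predicted SD ~ {rf.predict(case)[0]:.0f} min "
      f"(tree spread {per_tree.min():.0f}-{per_tree.max():.0f} min)")
```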


Surgery as a work-process (repetitive, semi-repetitive, memoryless)

Surgery is a work-process, and like any work-process it requires inputs to achieve the desired output, a recuperating post-surgery patient. Examples of work-process inputs, borrowed from production engineering, are the five M's: "money, manpower, materials, machinery, methods" (where "manpower" refers to the human element in general). Like all work-processes in industry and the services, surgeries have a certain characteristic work-content, which may be unstable to various degrees (within the defined statistical population at which the prediction method aims). This instability generates a source of SD variability that affects SD distributional shape (from the normal distribution, for purely repetitive processes, to the exponential, for purely memoryless processes). Ignoring this source may confound its variability with that due to covariates (as detailed earlier). Therefore, as all work-processes may be partitioned into three types (repetitive, semi-repetitive, memoryless), surgeries may be similarly partitioned. A stochastic model that accounts for work-content instability has recently been developed; it delivers a family of distributions with the normal/lognormal and exponential as exact special cases. This model has been applied to construct a statistical process control scheme for SD (Shore, Haim (2021). "Estimating operating room utilisation rate for differently distributed surgery times". ''International Journal of Production Research'' 61(2): 447–461. doi:10.1080/00207543.2021.2009141).

