HOME

TheInfoList



OR:

Recursive partitioning is a
statistical Statistics (from German: ''Statistik'', "description of a state, a country") is the discipline that concerns the collection, organization, analysis, interpretation, and presentation of data. In applying statistics to a scientific, industria ...
method for
multivariable analysis Multivariate statistics is a subdivision of statistics encompassing the simultaneous observation and analysis of more than one outcome variable. Multivariate statistics concerns understanding the different aims and background of each of the dif ...
. Recursive partitioning creates a
decision tree A decision tree is a decision support tool that uses a tree-like model of decisions and their possible consequences, including chance event outcomes, resource costs, and utility. It is one way to display an algorithm that only contains condit ...
that strives to correctly classify members of the population by splitting it into sub-populations based on several dichotomous
independent variable Dependent and independent variables are variables in mathematical modeling, statistical modeling and experimental sciences. Dependent variables receive this name because, in an experiment, their values are studied under the supposition or demand ...
s. The process is termed
recursive Recursion (adjective: ''recursive'') occurs when a thing is defined in terms of itself or of its type. Recursion is used in a variety of disciplines ranging from linguistics to logic. The most common application of recursion is in mathematics ...
because each sub-population may in turn be split an indefinite number of times until the splitting process terminates after a particular stopping criterion is reached. Recursive partitioning methods have been developed since the 1980s. Well known methods of recursive partitioning include Ross Quinlan's
ID3 algorithm In decision tree learning, ID3 (Iterative Dichotomiser 3) is an algorithm invented by Ross QuinlanQuinlan, J. R. 1986. Induction of Decision Trees. Mach. Learn. 1, 1 (Mar. 1986), 81–106 used to generate a decision tree from a dataset. ID3 is the ...
and its successors, C4.5 and C5.0 and Classification and Regression Trees (CART).
Ensemble learning In statistics and machine learning, ensemble methods use multiple learning algorithms to obtain better predictive performance than could be obtained from any of the constituent learning algorithms alone. Unlike a statistical ensemble in statisti ...
methods such as
Random Forest Random forests or random decision forests is an ensemble learning method for classification, regression and other tasks that operates by constructing a multitude of decision trees at training time. For classification tasks, the output of th ...
s help to overcome a common criticism of these methods – their vulnerability to
overfitting mathematical modeling, overfitting is "the production of an analysis that corresponds too closely or exactly to a particular set of data, and may therefore fail to fit to additional data or predict future observations reliably". An overfitt ...
of the data – by employing different algorithms and combining their output in some way. This article focuses on recursive partitioning for medical
diagnostic Diagnosis is the identification of the nature and cause of a certain phenomenon. Diagnosis is used in many different disciplines, with variations in the use of logic, analytics, and experience, to determine " cause and effect". In systems engine ...
tests, but the technique has far wider applications. See
decision tree A decision tree is a decision support tool that uses a tree-like model of decisions and their possible consequences, including chance event outcomes, resource costs, and utility. It is one way to display an algorithm that only contains condit ...
. As compared to regression analysis, which creates a formula that health care providers can use to calculate the probability that a patient has a disease, recursive partition creates a rule such as 'If a patient has finding x, y, or z they probably have disease q'. A variation is 'Cox linear recursive partitioning'.


Advantages and disadvantages

Compared to other multivariable methods, recursive partitioning has advantages and disadvantages. *Advantages are: **Generates clinically more intuitive models that do not require the user to perform calculations. **Allows varying prioritizing of misclassifications in order to create a decision rule that has more sensitivity or specificity. **May be more accurate. *Disadvantages are: ** Does not work well for continuous variables ** May overfit data.


Examples

Examples are available of using recursive partitioning in research of diagnostic tests. Goldman used recursive partitioning to prioritize sensitivity in the diagnosis of
myocardial infarction A myocardial infarction (MI), commonly known as a heart attack, occurs when blood flow decreases or stops to the coronary artery of the heart, causing damage to the heart muscle. The most common symptom is chest pain or discomfort which may ...
among patients with chest pain in the emergency room.


See also

*
Decision tree learning Decision tree learning is a supervised learning approach used in statistics, data mining and machine learning. In this formalism, a classification or regression decision tree is used as a predictive model to draw conclusions about a set of obse ...


References

{{reflist Statistical classification Biostatistics