Chi-square automatic interaction detection (CHAID) is a decision tree technique based on adjusted significance testing (Bonferroni correction, Holm-Bonferroni testing).
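The idea of adjusted significance testing can be illustrated with a minimal sketch. It is not the published CHAID procedure: the predictor names and counts are hypothetical, the helper functions are written for this example only, and the Bonferroni multiplier used here is simply the number of candidate predictors, which is an assumption made for illustration. Each candidate predictor is cross-tabulated against a response, a Pearson chi-square test is run on each table, the p-values are adjusted, and the predictor with the smallest adjusted p-value is the candidate for the split.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of the chi-squared distribution for integer df >= 1."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{r=0}^{df/2-1} (x/2)^r / r!
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= (x / 2) / r
            total += term
        return math.exp(-x / 2) * total
    # Odd df: erfc term plus a finite series in x^(r-1) / (1*3*...*(2r-1))
    term, total = 1.0, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    tail = math.sqrt(2 * x / math.pi) * math.exp(-x / 2) * total
    return math.erfc(math.sqrt(x / 2)) + tail

def pearson_chi2(table):
    """Pearson chi-square statistic and degrees of freedom for a table of counts.

    Assumes every row and column has a non-zero total.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

def bonferroni_adjust(p_value, n_tests):
    """Bonferroni: multiply the p-value by the number of tests, capped at 1."""
    return min(1.0, p_value * n_tests)

def holm_adjust(p_values):
    """Holm-Bonferroni step-down adjusted p-values, in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)  # keep adjusted values monotone
        adjusted[idx] = running_max
    return adjusted

# Hypothetical data: rows are predictor categories, columns are (responded, did not respond).
tables = {
    "age_band": [[30, 70], [50, 50], [70, 30]],
    "region": [[48, 52], [50, 50], [52, 48]],
}
raw_p = {name: chi2_sf(*pearson_chi2(t)) for name, t in tables.items()}
adjusted_p = {name: bonferroni_adjust(p, len(tables)) for name, p in raw_p.items()}
best_predictor = min(adjusted_p, key=adjusted_p.get)
print(best_predictor, adjusted_p)
```

In this made-up data the response rate changes strongly across age bands and hardly at all across regions, so the age band has the smaller adjusted p-value. The adjustment matters because testing many candidate splits inflates the chance that at least one appears significant by accident.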


History

CHAID is based on a formal extension of AID (Automatic Interaction Detection) and THAID (THeta Automatic Interaction Detection) procedures of the 1960s and 1970s, which in turn were extensions of earlier research, including that performed by Belson in the UK in the 1950s. In 1975, the CHAID technique itself was developed in South Africa. It was published in 1980 by Gordon V. Kass, who had completed a PhD thesis on the topic. A history of earlier supervised tree methods can be found in Ritschard, including a detailed description of the original CHAID algorithm and the exhaustive CHAID extension by Biggs, De Ville, and Suen.


Properties

CHAID can be used for prediction (in a similar fashion to regression analysis; this version of CHAID was originally known as XAID) as well as for classification and for detecting interaction between variables. In practice, CHAID is often used in direct marketing to select groups of consumers and to predict how their responses to some variables affect other variables, although other early applications were in medical and psychiatric research. Like other decision trees, CHAID produces output that is highly visual and easy to interpret. Because it uses multiway splits by default, it needs rather large sample sizes to work effectively: with small samples, the respondent groups can quickly become too small for reliable analysis. One important advantage of CHAID over alternatives such as multiple regression is that it is non-parametric.
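The sample-size point can be made concrete with a little arithmetic. The sketch below assumes, purely for illustration, that every node is split into the same number of equally sized branches; real trees are unbalanced, so some groups would be smaller still. The function name and the figures are hypothetical.

```python
def average_node_size(n_cases, branches_per_split, depth):
    """Average cases per node after `depth` levels of equal, balanced splits."""
    return n_cases / branches_per_split ** depth

# Compare binary splits with 4-way splits on 1000 cases, down to three levels.
for branches in (2, 4):
    sizes = [average_node_size(1000, branches, d) for d in range(4)]
    print(f"{branches}-way splits:", sizes)
```

With 1000 cases, three levels of binary splits leave 1000 / 8 = 125 cases per node on average, whereas three levels of 4-way splits leave 1000 / 64, or fewer than 16.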


See also

* Bonferroni correction
* Chi-squared distribution
* Decision tree learning
* Latent class model
* Market segment
* Multiple comparisons
* Structural equation modeling


Bibliography

* Press, Laurence I.; Rogers, Miles S.; & Shure, Gerald H.; ''An interactive technique for the analysis of multivariate data'', Behavioral Science, Vol. 14 (1969), pp. 364–370
* Hawkins, Douglas M.; & Kass, Gordon V.; ''Automatic Interaction Detection'', in Hawkins, Douglas M. (ed); ''Topics in Applied Multivariate Analysis'', Cambridge University Press, Cambridge, 1982, pp. 269–302
* Hooton, Thomas M.; Haley, Robert W.; Culver, David H.; White, John W.; Morgan, W. Meade; & Carroll, Raymond J.; ''The Joint Associations of Multiple Risk Factors with the Occurrence of Nosocomial Infections'', American Journal of Medicine, Vol. 70 (1981), pp. 960–970
* Brink, Susanne; & Van Schalkwyk, Dirk J.; ''Serum ferritin and mean corpuscular volume as predictors of bone marrow iron stores'', South African Medical Journal, Vol. 61 (1982), pp. 432–434
* McKenzie, Dean P.; McGorry, Patrick D.; Wallace, Chris S.; Low, Lee H.; Copolov, David L.; & Singh, Bruce S.; ''Constructing a Minimal Diagnostic Decision Tree'', Methods of Information in Medicine, Vol. 32 (1993), pp. 161–166
* Magidson, Jay; ''The CHAID approach to segmentation modeling: chi-squared automatic interaction detection'', in Bagozzi, Richard P. (ed); ''Advanced Methods of Marketing Research'', Blackwell, Oxford, GB, 1994, pp. 118–159
* Hawkins, Douglas M.; Young, S. S.; & Rosinko, A.; ''Analysis of a large structure-activity dataset using recursive partitioning'', Quantitative Structure-Activity Relationships, Vol. 16 (1997), pp. 296–302


External links

* Luchman, J.N.; ''CHAID: Stata module to conduct chi-square automated interaction detection'', available for free download, or type within Stata: ssc install chaid.
* Luchman, J.N.; ''CHAIDFOREST: Stata module to conduct random forest ensemble classification based on chi-square automated interaction detection (CHAID) as base learner'', available for free download, or type within Stata: ssc install chaidforest.
* IBM SPSS Decision Trees grows exhaustive CHAID trees as well as a few other types of trees such as CART.
* An R package, ''CHAID'', is available on R-Forge.