Chi-square Automatic Interaction Detection

	Chi-square Automatic Interaction Detection Chi-square automatic interaction detection (CHAID) is a decision tree technique based on adjusted significance testing ( Bonferroni correction, Holm-Bonferroni testing). The technique was developed in South Africa and was published in 1980 by Gordon V. Kass, who had completed a PhD thesis on this topic. CHAID can be used for prediction (in a similar fashion to regression analysis, this version of CHAID being originally known as XAID) as well as classification, and for detection of interaction between variables. CHAID is based on a formal extension of AID (Automatic Interaction Detection) and THAID (THeta Automatic Interaction Detection) procedures of the 1960s and 1970s, which in turn were extensions of earlier research, including that performed by Belson in the UK in the 1950s. A history of earlier supervised tree methods together with a detailed description of the original CHAID algorithm and the exhaustive CHAID extension by Biggs, De Ville, and Suen, can be found in Ritschard. ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
picture info	Decision Tree Learning Decision tree learning is a supervised learning approach used in statistics, data mining and machine learning. In this formalism, a classification or regression decision tree is used as a predictive model to draw conclusions about a set of observations. Tree models where the target variable can take a discrete set of values are called classification trees; in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. Decision trees where the target variable can take continuous values (typically real numbers) are called regression trees. Decision trees are among the most popular machine learning algorithms given their intelligibility and simplicity. In decision analysis, a decision tree can be used to visually and explicitly represent decisions and decision making. In data mining, a decision tree describes data (but the resulting classification tree can be an input for decision making). General Dec ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
	Bonferroni Correction In statistics, the Bonferroni correction is a method to counteract the multiple comparisons problem. Background The method is named for its use of the Bonferroni inequalities. An extension of the method to confidence intervals was proposed by Olive Jean Dunn. Statistical hypothesis testing is based on rejecting the null hypothesis if the likelihood of the observed data under the null hypotheses is low. If multiple hypotheses are tested, the probability of observing a rare event increases, and therefore, the likelihood of incorrectly rejecting a null hypothesis (i.e., making a Type I error) increases. The Bonferroni correction compensates for that increase by testing each individual hypothesis at a significance level of \alpha/m, where \alpha is the desired overall alpha level and m is the number of hypotheses. For example, if a trial is testing m = 20 hypotheses with a desired \alpha = 0.05, then the Bonferroni correction would test each individual hypothesis at \alpha = 0.05/20 = ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
picture info	Regression Analysis In statistical modeling, regression analysis is a set of statistical processes for estimating the relationships between a dependent variable (often called the 'outcome' or 'response' variable, or a 'label' in machine learning parlance) and one or more independent variables (often called 'predictors', 'covariates', 'explanatory variables' or 'features'). The most common form of regression analysis is linear regression, in which one finds the line (or a more complex linear combination) that most closely fits the data according to a specific mathematical criterion. For example, the method of ordinary least squares computes the unique line (or hyperplane) that minimizes the sum of squared differences between the true data and that line (or hyperplane). For specific mathematical reasons (see linear regression), this allows the researcher to estimate the conditional expectation (or population average value) of the dependent variable when the independent variables take on a given ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
picture info	Direct Marketing Direct marketing is a form of communicating an offer, where organizations communicate directly to a pre-selected customer and supply a method for a direct response. Among practitioners, it is also known as ''direct response marketing''. By contrast, advertising is of a mass-message nature. Response channels include toll-free telephone numbers, reply cards, reply forms to be sent in an envelope, websites and email addresses. The prevalence of direct marketing and the unwelcome nature of some communications has led to regulations and laws such as the CAN-SPAM Act, requiring that consumers in the United States be allowed to opt-out. Overview Intended targets are selected from larger populations based on vendor-defined criteria, including average income for a particular ZIP code, purchasing history and presence on other lists. The goal is "to sell directly to consumers" without letting others "join (the) parade." Popularity A 2010 study by the Direct Marketing Associatio ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
picture info	Chi-squared Distribution In probability theory and statistics, the chi-squared distribution (also chi-square or \chi^2-distribution) with k degrees of freedom is the distribution of a sum of the squares of k independent standard normal random variables. The chi-squared distribution is a special case of the gamma distribution and is one of the most widely used probability distributions in inferential statistics, notably in hypothesis testing and in construction of confidence intervals. This distribution is sometimes called the central chi-squared distribution, a special case of the more general noncentral chi-squared distribution. The chi-squared distribution is used in the common chi-squared tests for goodness of fit of an observed distribution to a theoretical one, the independence of two criteria of classification of qualitative data, and in confidence interval estimation for a population standard deviation of a normal distribution from a sample standard deviation. Many other statistical tests a ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
	Latent Class Model In statistics, a latent class model (LCM) relates a set of observed (usually discrete) multivariate variables to a set of latent variables. It is a type of latent variable model. It is called a latent class model because the latent variable is discrete. A class is characterized by a pattern of conditional probabilities that indicate the chance that variables take on certain values. Latent class analysis (LCA) is a subset of structural equation modeling, used to find groups or subtypes of cases in multivariate categorical data. These subtypes are called "latent classes".Lazarsfeld, P.F. and Henry, N.W. (1968) ''Latent structure analysis''. Boston: Houghton Mifflin Formann, A. K. (1984). ''Latent Class Analyse: Einführung in die Theorie und Anwendung atent class analysis: Introduction to theory and application'. Weinheim: Beltz. Confronted with a situation as follows, a researcher might choose to use LCA to understand the data: Imagine that symptoms a-d have been measured in a range ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
picture info	Structural Equation Modeling Structural equation modeling (SEM) is a label for a diverse set of methods used by scientists in both experimental and observational research across the sciences, business, and other fields. It is used most in the social and behavioral sciences. A definition of SEM is difficult without reference to highly technical language, but a good starting place is the name itself. SEM involves the construction of a ''model'', to represent how various aspects of an observable or theoretical phenomenon are thought to be causally structurally related to one another. The ''structural'' aspect of the model implies theoretical associations between variables that represent the phenomenon under investigation. The postulated causal structuring is often depicted with arrows representing causal connections between variables (as in Figures 1 and 2) but these causal connections can be equivalently represented as equations. The causal structures imply that specific patterns of connections should appe ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
picture info	Market Segment In marketing, market segmentation is the process of dividing a broad consumer or business market, normally consisting of existing and potential customers, into sub-groups of consumers (known as ''segments'') based on some type of shared characteristics. In dividing or segmenting markets, researchers typically look for common characteristics such as shared needs, common interests, similar lifestyles, or even similar demographic profiles. The overall aim of segmentation is to identify ''high yield segments'' – that is, those segments that are likely to be the most profitable or that have growth potential – so that these can be selected for special attention (i.e. become target markets). Many different ways to segment a market have been identified. Business-to-business (B2B) sellers might segment the market into different types of businesses or countries, while business-to-consumer (B2C) sellers might segment the market into demographic segments, such as lifestyle, behavior, ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
picture info	Multiple Comparisons In statistics, the multiple comparisons, multiplicity or multiple testing problem occurs when one considers a set of statistical inferences simultaneously or infers a subset of parameters selected based on the observed values. The more inferences are made, the more likely erroneous inferences become. Several statistical techniques have been developed to address that problem, typically by requiring a stricter significance threshold for individual comparisons, so as to compensate for the number of inferences being made. History The problem of multiple comparisons received increased attention in the 1950s with the work of statisticians such as Tukey and Scheffé. Over the ensuing decades, many procedures were developed to address the problem. In 1996, the first international conference on multiple comparison procedures took place in Israel. Definition Multiple comparisons arise when a statistical analysis involves multiple simultaneous statistical tests, each of which has a potent ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
	Market Research Market research is an organized effort to gather information about target markets and customers: know about them, starting with who they are. It is an important component of business strategy and a major factor in maintaining competitiveness. Market research helps to identify and analyze the needs of the market, the market size and the competition. Its techniques encompass both qualitative techniques such as focus groups, in-depth interviews, and ethnography, as well as quantitative techniques such as customer surveys, and analysis of secondary data. It includes social and opinion research, and is the systematic gathering and interpretation of information about individuals or organizations using statistical and analytical methods and techniques of the applied social sciences to gain insight or support decision making. Market research, marketing research, and marketing are a sequence of business activities; sometimes these are handled informally. The field of ''marketing researc ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]
picture info	Market Segmentation In marketing, market segmentation is the process of dividing a broad consumer or business market, normally consisting of existing and potential customers, into sub-groups of consumers (known as ''segments'') based on some type of shared characteristics. In dividing or segmenting markets, researchers typically look for common characteristics such as shared needs, common interests, similar lifestyles, or even similar demographic profiles. The overall aim of segmentation is to identify ''high yield segments'' – that is, those segments that are likely to be the most profitable or that have growth potential – so that these can be selected for special attention (i.e. become target markets). Many different ways to segment a market have been identified. Business-to-business (B2B) sellers might segment the market into different types of businesses or countries, while business-to-consumer (B2C) sellers might segment the market into demographic segments, such as lifestyle, behavior, ... [...More Info...] [...Related Items...] OR: [Wikipedia] [Google] [Baidu]