latent class analysis
   HOME

TheInfoList



OR:

In statistics, a latent class model (LCM) relates a set of observed (usually discrete)
multivariate Multivariate may refer to: In mathematics * Multivariable calculus * Multivariate function * Multivariate polynomial In computing * Multivariate cryptography * Multivariate division algorithm * Multivariate interpolation * Multivariate optical c ...
variables to a set of
latent variable In statistics, latent variables (from Latin: present participle of ''lateo'', “lie hidden”) are variables that can only be inferred indirectly through a mathematical model from other observable variables that can be directly observed or me ...
s. It is a type of
latent variable model A latent variable model is a statistical model that relates a set of observable variables (also called ''manifest variables'' or ''indicators'') to a set of latent variables. It is assumed that the responses on the indicators or manifest variabl ...
. It is called a latent class model because the latent variable is discrete. A class is characterized by a pattern of
conditional probabilities In probability theory, conditional probability is a measure of the probability of an event occurring, given that another event (by assumption, presumption, assertion or evidence) has already occurred. This particular method relies on event B occur ...
that indicate the chance that variables take on certain values. Latent class analysis (LCA) is a subset of structural equation modeling, used to find groups or subtypes of cases in multivariate
categorical data In statistics, a categorical variable (also called qualitative variable) is a variable that can take on one of a limited, and usually fixed, number of possible values, assigning each individual or other unit of observation to a particular group or ...
. These subtypes are called "latent classes".Lazarsfeld, P.F. and Henry, N.W. (1968) ''Latent structure analysis''. Boston: Houghton Mifflin Formann, A. K. (1984). ''Latent Class Analyse: Einführung in die Theorie und Anwendung atent class analysis: Introduction to theory and application'. Weinheim: Beltz. Confronted with a situation as follows, a researcher might choose to use LCA to understand the data: Imagine that symptoms a-d have been measured in a range of patients with diseases X, Y, and Z, and that disease X is associated with the presence of symptoms a, b, and c, disease Y with symptoms b, c, d, and disease Z with symptoms a, c and d. The LCA will attempt to detect the presence of latent classes (the disease entities), creating patterns of association in the symptoms. As in
factor analysis Factor analysis is a statistical method used to describe variability among observed, correlated variables in terms of a potentially lower number of unobserved variables called factors. For example, it is possible that variations in six observed ...
, the LCA can also be used to classify case according to their
maximum likelihood In statistics, maximum likelihood estimation (MLE) is a method of estimating the parameters of an assumed probability distribution, given some observed data. This is achieved by maximizing a likelihood function so that, under the assumed stat ...
class membership. Because the criterion for solving the LCA is to achieve latent classes within which there is no longer any association of one symptom with another (because the class is the disease which causes their association), and the set of diseases a patient has (or class a case is a member of) causes the symptom association, the symptoms will be "conditionally independent", i.e., conditional on class membership, they are no longer related.


Model

Within each latent class, the observed variables are
statistically independent Independence is a fundamental notion in probability theory, as in statistics and the theory of stochastic processes. Two events are independent, statistically independent, or stochastically independent if, informally speaking, the occurrence of o ...
. This is an important aspect. Usually the observed variables are statistically dependent. By introducing the latent variable, independence is restored in the sense that within classes variables are independent ( local independence). We then say that the association between the observed variables is explained by the classes of the latent variable (McCutcheon, 1987). In one form, the latent class model is written as : p_ \approx \sum_t^T p_t \, \prod_n^N p^n_, where T is the number of latent classes and p_t are the so-called recruitment or unconditional probabilities that should sum to one. p^n_ are the marginal or conditional probabilities. For a two-way latent class model, the form is : p_ \approx \sum_t^T p_t \, p_ \, p_. This two-way model is related to
probabilistic latent semantic analysis Probabilistic latent semantic analysis (PLSA), also known as probabilistic latent semantic indexing (PLSI, especially in information retrieval circles) is a statistical technique for the analysis of two-mode and co-occurrence data. In effect, one ca ...
and
non-negative matrix factorization Non-negative matrix factorization (NMF or NNMF), also non-negative matrix approximation is a group of algorithms in multivariate analysis and linear algebra where a matrix is factorized into (usually) two matrices and , with the property that ...
.


Related methods

There are a number of methods with distinct names and uses that share a common relationship. Cluster analysis is, like LCA, used to discover taxon-like groups of cases in data. Multivariate mixture estimation (MME) is applicable to continuous data, and assumes that such data arise from a mixture of distributions: imagine a set of heights arising from a mixture of men and women. If a multivariate mixture estimation is constrained so that measures must be uncorrelated within each distribution it is termed
latent profile analysis In statistics, a mixture model is a probabilistic model for representing the presence of subpopulations within an overall population, without requiring that an observed data set should identify the sub-population to which an individual observati ...
. Modified to handle discrete data, this constrained analysis is known as LCA. Discrete latent trait models further constrain the classes to form from segments of a single dimension: essentially allocating members to classes on that dimension: an example would be assigning cases to social classes on a dimension of ability or merit. As a practical instance, the variables could be
multiple choice Multiple choice (MC), objective response or MCQ (for multiple choice question) is a form of an objective assessment in which respondents are asked to select only correct answers from the choices offered as a list. The multiple choice format is mo ...
items of a political questionnaire. The data in this case consists of a N-way contingency table with answers to the items for a number of respondents. In this example, the latent variable refers to political opinion and the latent classes to political groups. Given group membership, the
conditional probabilities In probability theory, conditional probability is a measure of the probability of an event occurring, given that another event (by assumption, presumption, assertion or evidence) has already occurred. This particular method relies on event B occur ...
specify the chance certain answers are chosen.


Application

LCA may be used in many fields, such as:
collaborative filtering Collaborative filtering (CF) is a technique used by recommender systems.Francesco Ricci and Lior Rokach and Bracha ShapiraIntroduction to Recommender Systems Handbook Recommender Systems Handbook, Springer, 2011, pp. 1-35 Collaborative filtering ...
, ''
Behavior Genetics Behavioural genetics, also referred to as behaviour genetics, is a field of scientific research that uses genetic methods to investigate the nature and origins of individual differences in behaviour. While the name "behavioural genetics" ...
'' an
Evaluation of diagnostic tests


References

* * * *


External links

* Statistical Innovations
Home Page
2016. Website with latent class software (Latent GOLD 5.1), free demonstrations, tutorials, user guides, and publications for download. Also included: online courses, FAQs, and other related software. * The Methodology Center
Latent Class Analysis
a research center at Penn State, free software, FAQ * John Uebersax
Latent Class Analysis
2006. A web-site with bibliography, software, links and FAQ for latent class analysis {{DEFAULTSORT:Latent Class Model Classification algorithms Latent variable models Market research Market segmentation