Chunklet

In computer science, constrained clustering is a class of semi-supervised learning algorithms. Typically, constrained clustering incorporates either a set of must-link constraints, cannot-link constraints, or both, with a data clustering algorithm. A cluster in which the members conform to all must-link and cannot-link constraints is called a chunklet.


Types of constraints

Both must-link and cannot-link constraints define a relationship between two data instances. Together, the sets of these constraints guide a constrained clustering algorithm as it attempts to find chunklets (clusters in the dataset that satisfy the specified constraints).

* A must-link constraint specifies that the two instances in the must-link relation should be associated with the same cluster.
* A cannot-link constraint specifies that the two instances in the cannot-link relation should not be associated with the same cluster.

Some constrained clustering algorithms will abort if no clustering exists that satisfies the specified constraints. Others will try to minimize the amount of constraint violation when no clustering can satisfy all of them. Constraints can also be used to guide the selection of a clustering model among several possible solutions.
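The two constraint types can be illustrated with a short Python sketch. It is not taken from any particular library: the names `count_violations` and `is_feasible` are hypothetical, and the representation (a dict mapping each instance to a cluster label, plus lists of instance pairs) is an assumption made for the example. The first function counts how many constraints a given clustering breaks, which is the quantity a violation-minimizing algorithm would try to keep small. The second checks whether the constraint sets contradict each other, using a union-find structure to merge must-linked instances.

```python
from typing import Dict, Hashable, Iterable, Tuple

Pair = Tuple[Hashable, Hashable]


def count_violations(labels: Dict[Hashable, int],
                     must_link: Iterable[Pair],
                     cannot_link: Iterable[Pair]) -> Tuple[int, int]:
    """Return (violated must-link count, violated cannot-link count)."""
    # A must-link pair is violated when its two instances have different labels.
    ml = sum(1 for a, b in must_link if labels[a] != labels[b])
    # A cannot-link pair is violated when its two instances share a label.
    cl = sum(1 for a, b in cannot_link if labels[a] == labels[b])
    return ml, cl


def is_feasible(items: Iterable[Hashable],
                must_link: Iterable[Pair],
                cannot_link: Iterable[Pair]) -> bool:
    """Check that no cannot-link pair is forced together by must-link chains."""
    parent = {x: x for x in items}

    def find(x):
        # Follow parent pointers to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Must-link is transitive, so merge every must-linked pair into one group.
    for a, b in must_link:
        parent[find(a)] = find(b)

    # The constraints contradict each other if a cannot-link pair
    # ends up inside the same must-link group.
    return all(find(a) != find(b) for a, b in cannot_link)
```

For example, with `labels = {"a": 0, "b": 0, "c": 1}`, the call `count_violations(labels, [("a", "b")], [("a", "c")])` reports no violations. Note that `is_feasible` ignores the number of clusters: passing this check guarantees a satisfying clustering only when the number of clusters is unrestricted, and is merely a necessary condition when the number of clusters is fixed.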


Examples

Examples of constrained clustering algorithms include:

* COP K-means
* PCKmeans (Pairwise Constrained K-means)
* CMWK-Means (Constrained Minkowski Weighted K-Means)
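To make the idea concrete, the following is a minimal Python sketch in the spirit of COP K-means, not a reference implementation of any of the algorithms listed above. It treats the constraints as hard: each point is assigned to the nearest center whose cluster does not break a constraint with an already-assigned point, and the run aborts (returns `None`) if no such cluster exists. The function names, the parameters `iters` and `seed`, and the choice to keep an empty cluster's old center are all assumptions made for the example.

```python
import random
from typing import List, Optional, Sequence, Tuple

Point = Sequence[float]
Pair = Tuple[int, int]  # pairs of point indices


def sq_dist(p: Point, q: Point) -> float:
    """Squared Euclidean distance (enough for nearest-center ranking)."""
    return sum((x - y) ** 2 for x, y in zip(p, q))


def violates(i: int, c: int, labels: List[Optional[int]],
             must_link: Sequence[Pair], cannot_link: Sequence[Pair]) -> bool:
    """True if placing point i in cluster c breaks a constraint
    with a point that has already been assigned in this pass."""
    for a, b in must_link:
        if i in (a, b):
            other = b if a == i else a
            if labels[other] is not None and labels[other] != c:
                return True
    for a, b in cannot_link:
        if i in (a, b):
            other = b if a == i else a
            if labels[other] == c:
                return True
    return False


def cop_kmeans(points: Sequence[Point], k: int,
               must_link: Sequence[Pair] = (),
               cannot_link: Sequence[Pair] = (),
               iters: int = 50, seed: int = 0) -> Optional[List[int]]:
    """Return one cluster label per point, or None if some point
    cannot be placed without violating a constraint."""
    rng = random.Random(seed)
    # Initialize centers from k distinct input points.
    centers = [list(p) for p in rng.sample(list(points), k)]
    labels: Optional[List[int]] = None
    for _ in range(iters):
        new: List[Optional[int]] = [None] * len(points)
        for i, p in enumerate(points):
            # Try clusters from nearest to farthest center.
            order = sorted(range(k), key=lambda c: sq_dist(p, centers[c]))
            for c in order:
                if not violates(i, c, new, must_link, cannot_link):
                    new[i] = c
                    break
            else:
                return None  # no admissible cluster: abort
        if new == labels:
            break  # assignments stopped changing
        labels = new
        # Move each center to the mean of its members.
        for c in range(k):
            members = [points[i] for i, lab in enumerate(labels) if lab == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = [sum(xs) / len(members) for xs in zip(*members)]
    return labels
```

A call such as `cop_kmeans(points, 2, must_link=[(0, 1)], cannot_link=[(0, 2)])` returns labels that place points 0 and 1 together and point 2 elsewhere. Because assignment is greedy and order-dependent, a sketch like this can abort even when a satisfying clustering exists; algorithms that instead penalize violations trade that hard guarantee for robustness.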

