Balanced clustering is a special case of
clustering where, in the strictest sense, cluster sizes are constrained to
or
, where
is the number of points and
is the number of clusters. A typical algorithm is balanced
k-means
''k''-means clustering is a method of vector quantization, originally from signal processing, that aims to partition ''n'' observations into ''k'' clusters in which each observation belongs to the cluster with the nearest mean (cluster centers o ...
, which minimizes
mean square error (MSE). Another type of balanced clustering called balance-driven clustering has a two-objective cost function that minimizes both the imbalance and the MSE. Typical cost functions are ratio cut and Ncut. Balanced clustering can be used for example in scenarios where freight has to be delivered to
locations with
cars. It is then preferred that each car delivers to an equal number of locations.
Software
There exists implementations for balanced k-means and Ncut
References
{{cite journal , doi=10.1134/S1064226917120105 , title=On Balanced Clustering (Indices, Models, Examples) , year=2017 , last1=Levin , first1=M. Sh. , journal=Journal of Communications Technology and Electronics , volume=62 , issue=12 , pages=1506–1515 , s2cid=255277095
Clustering criteria